Applications of Regular Algebra to Language Theory Problems. Roland Backhouse February 2001


Introduction

Examples: path-finding problems; the membership problem for context-free languages; error repair.

Programming method: express the problem as solving a system of (recursive) equations, then solve the equations using, e.g., iterative approximation or an elimination technique.

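Examples

Grammar: $S ::= aSS \mid \varepsilon$

Is-empty: $S \neq \phi \;\equiv\; (\{a\} \neq \phi \,\wedge\, S \neq \phi \,\wedge\, S \neq \phi) \,\vee\, \{\varepsilon\} \neq \phi$

Nullable: $\varepsilon \in S \;\equiv\; (\varepsilon \in \{a\} \,\wedge\, \varepsilon \in S \,\wedge\, \varepsilon \in S) \,\vee\, \varepsilon \in \{\varepsilon\}$

Shortest word length: $\#S = (\#a + \#S + \#S) \downarrow \#\varepsilon$, where $\downarrow$ denotes minimum.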
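Each of these equations can be solved by iterating from the least element of its range. The following is a minimal sketch, not the author's code: the helper `least_fixed_point` and the second grammar `T ::= aTT | b` are assumptions introduced here so that the result is not trivially determined by the ε-production.

```python
import math

def least_fixed_point(step, bottom):
    """Iterate `step` from `bottom` until the value no longer changes."""
    current = bottom
    while True:
        following = step(current)
        if following == current:
            return current
        current = following

# S ::= aSS | ε, read in three different ranges.
# Non-emptiness: {a} and {ε} are both non-empty, so both constants are True.
s_non_empty = least_fixed_point(lambda s: (True and s and s) or True, False)
# Nullability: ε is not in {a} (False) but is in {ε} (True).
s_nullable = least_fixed_point(lambda s: (False and s and s) or True, False)
# Shortest word length: #a = 1, #ε = 0, least element is infinity.
s_shortest = least_fixed_point(lambda n: min(1 + n + n, 0), math.inf)

# Hypothetical grammar T ::= aTT | b (an assumption, not in the slides).
t_non_empty = least_fixed_point(lambda t: (True and t and t) or True, False)
t_nullable = least_fixed_point(lambda t: (False and t and t) or False, False)
t_shortest = least_fixed_point(lambda n: min(1 + n + n, 1), math.inf)

print(s_non_empty, s_nullable, s_shortest)
print(t_non_empty, t_nullable, t_shortest)
```

The same loop serves all three problems; only the range, its least element, and the reading of the operators change.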

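Non-Example

Grammar: $S ::= aSS \mid \varepsilon$

$\varepsilon \in S \;\equiv\; (\varepsilon \in \{a\} \,\wedge\, \varepsilon \in S \,\wedge\, \varepsilon \in S) \,\vee\, \varepsilon \in \{\varepsilon\}$

but

$aa \in S \;\not\equiv\; (aa \in \{a\} \,\wedge\, aa \in S \,\wedge\, aa \in S) \,\vee\, aa \in \{\varepsilon\}$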

Problem-Solving Strategy

1. Express the problem as solving a system of equations.
2. Solve the equations using an appropriate algorithm (iteration, elimination, Knuth's).

Constructing the System of Equations

When is a function on context-free languages expressible by a system of equations with the same structure as the context-free grammar? When the following hold:

1. The measure on words is extended to a measure on sets.
2. The range of the measure is a regular algebra.
3. The measure on words is compositional.

Theory

Fixed Point Calculus. Galois Connections. Regular Algebra.

Fixed Points

$S ::= aSS \mid \varepsilon$, that is, $S = \mu\langle X :: \{a\}\cdot X\cdot X \,\cup\, \{\varepsilon\}\rangle$.

$\mu f$ denotes the least fixed point of the (monotonic) endofunction $f$. We sometimes write $\mu_{\preceq} f$, using the subscript to indicate the ordering relation.

$\langle X : R : E\rangle$ denotes the function mapping values $X$ in range $R$ to $E$. The range $R$ may be omitted if it is understood from the context.
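The least fixed point of the language equation can be approximated by iterating from the empty language. The sketch below is an illustration under one stated assumption: languages are truncated to words of length at most `max_len`, so that the iteration terminates. The names `concat`, `f` and `mu_f` are introduced here and are not from the slides.

```python
def concat(xs, ys, max_len):
    """Concatenation of two languages, keeping only words up to max_len."""
    return {x + y for x in xs for y in ys if len(x) + len(y) <= max_len}

def f(language, max_len):
    """f.X = {a}·X·X ∪ {ε}, truncated to words of length <= max_len."""
    a_x = concat({"a"}, language, max_len)
    return concat(a_x, language, max_len) | {""}

def mu_f(max_len):
    """Least fixed point of the truncated f, by iteration from the empty set."""
    current = set()
    while True:
        following = f(current, max_len)
        if following == current:
            return current
        current = following

print(sorted(mu_f(3), key=len))
```

Truncation is harmless here because the words of length at most `max_len` in the fixed point depend only on shorter or equal-length words.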

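Galois Connections

Suppose $\mathcal{A} = (A, \sqsubseteq)$ and $\mathcal{B} = (B, \preceq)$ are partially ordered sets and suppose $F \in A \leftarrow B$ and $G \in B \leftarrow A$. Then $(F, G)$ is a Galois connection between $\mathcal{A}$ and $\mathcal{B}$ iff, for all $x \in B$ and $y \in A$,

$F.x \sqsubseteq y \;\equiv\; x \preceq G.y$.

We refer to $F$ as the lower adjoint and to $G$ as the upper adjoint.

Examples

Floor function: $n \le x \;\equiv\; n \le \lfloor x \rfloor$ (for integer $n$ and real $x$).

Negation: $\neg p \Rightarrow q \;\equiv\; p \Leftarrow \neg q$.

Maximum: $x \uparrow y \le z \;\equiv\; x \le z \,\wedge\, y \le z$.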
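The three example connections can be checked by brute force on small samples. This is only a sanity check on finite data, not a proof; the sample ranges and the helper `implies` are assumptions made for illustration.

```python
import math
from fractions import Fraction
from itertools import product

def implies(p, q):
    """Boolean implication p => q."""
    return (not p) or q

integers = range(-4, 5)
rationals = [Fraction(k, 4) for k in range(-16, 17)]  # stand-ins for reals
booleans = [False, True]

# Floor: n <= x  is equivalent to  n <= floor(x).
floor_ok = all((n <= x) == (n <= math.floor(x))
               for n in integers for x in rationals)

# Negation: (not p => q)  is equivalent to  (p <= not q), i.e. (not q => p).
negation_ok = all(implies(not p, q) == implies(not q, p)
                  for p, q in product(booleans, repeat=2))

# Maximum: max(x, y) <= z  is equivalent to  x <= z and y <= z.
maximum_ok = all((max(x, y) <= z) == (x <= z and y <= z)
                 for x, y, z in product(integers, repeat=3))

print(floor_ok, negation_ok, maximum_ok)
```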

Examples (Continued)

Let $\Sigma^{\ge k}$ denote the set of all words over alphabet $\Sigma$ of length at least $k$. Let $\#S$ denote the length of a shortest word in the language $S$. Then

$\#S \ge k \;\equiv\; S \subseteq \Sigma^{\ge k}$.

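Fundamental Theorem

Suppose that $\mathcal{A}$ is a poset and $\mathcal{B}$ is a complete poset. Then a monotonic function $F \in A \leftarrow B$ (from $B$ to $A$) is a lower adjoint in a Galois connection equivales $F$ is sup-preserving.

Examples

Let $\mathcal{S}$ denote a bag of sets. Then

$\bigcup\mathcal{S} \neq \phi \;\equiv\; \langle \exists S : S \in \mathcal{S} : S \neq \phi \rangle$

$x \in \bigcup\mathcal{S} \;\equiv\; \langle \exists S : S \in \mathcal{S} : x \in S \rangle$.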

Unity of Opposites

Suppose $F \in A \leftarrow B$ and $G \in B \leftarrow A$ are Galois connected functions, $F$ being the lower adjoint and $G$ being the upper adjoint. Then $F.B$ and $G.A$ are isomorphic posets. Moreover, if one of $\mathcal{A}$ or $\mathcal{B}$ is $C$-complete, for some shape poset $C$, then $F.B$ and $G.A$ are also $C$-complete.

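Fusion Theorem

Suppose $f \in A \leftarrow B$ is the lower adjoint in a Galois connection between the complete posets $(A, \sqsubseteq)$ and $(B, \preceq)$. Suppose also that $g \in (B, \preceq) \leftarrow (B, \preceq)$ and $h \in (A, \sqsubseteq) \leftarrow (A, \sqsubseteq)$ are monotonic functions. Then

$f.\mu_{\preceq}\, g = \mu_{\sqsubseteq}\, h \;\Leftarrow\; f \circ g = h \circ f$.

$f \circ g$ denotes the composition of functions $f$ and $g$, and $f.x$ denotes application of function $f$ to $x$.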

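An (Elementary) Application

Consider the grammar $S ::= aS \mid SS \mid \varepsilon$. We want to write $x \in S$ as a fixed point equation. That is, we want to construct $g$ such that $x \in S \equiv \mu g$.

Recall: $x \in \bigcup\mathcal{S} \;\equiv\; \langle \exists S : S \in \mathcal{S} : x \in S \rangle$. So, for all $x$, the boolean-valued function $(x \in)$ is a lower adjoint.

Also, $S = \mu f$ where $f$ maps set $X$ to $\{a\}\cdot X \,\cup\, X\cdot X \,\cup\, \{\varepsilon\}$.

Applying the fusion theorem,

$(x \in \mu f \equiv \mu g) \;\Leftarrow\; \langle \forall S :: x \in f.S \equiv g.(x \in S) \rangle$.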

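An Application: the empty word

    $\varepsilon \in f.S$
=       { definition of $f$ }
    $\varepsilon \in (\{a\}\cdot S \,\cup\, S\cdot S \,\cup\, \{\varepsilon\})$
=       { membership distributes through set union }
    $\varepsilon \in \{a\}\cdot S \,\vee\, \varepsilon \in S\cdot S \,\vee\, \varepsilon \in \{\varepsilon\}$
=       { $\varepsilon \in X\cdot Y \equiv \varepsilon \in X \wedge \varepsilon \in Y$ }
    $(\varepsilon \in \{a\} \wedge \varepsilon \in S) \,\vee\, (\varepsilon \in S \wedge \varepsilon \in S) \,\vee\, \varepsilon \in \{\varepsilon\}$
=       { $g.b = (\varepsilon \in \{a\} \wedge b) \vee (b \wedge b) \vee \varepsilon \in \{\varepsilon\}$; see below for why the rhs has not been simplified further }
    $g.(\varepsilon \in S)$.

Thus,

$\varepsilon \in \mu\langle X :: \{a\}\cdot X \cup X\cdot X \cup \{\varepsilon\}\rangle \;\equiv\; \mu\langle b :: (\varepsilon \in \{a\} \wedge b) \vee (b \wedge b) \vee \varepsilon \in \{\varepsilon\}\rangle$.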

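An Application: not the empty word

    $a \in f.S$
=       { definition of $f$ }
    $a \in (\{a\}\cdot S \,\cup\, S\cdot S \,\cup\, \{\varepsilon\})$
=       { membership distributes through set union }
    $a \in \{a\}\cdot S \,\vee\, a \in S\cdot S \,\vee\, a \in \{\varepsilon\}$
=       { $a \in X\cdot Y \equiv a \in X \;?\; a \in Y$ }
    ???

The calculation cannot be completed!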
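The reason the calculation stalls can be made concrete: whether $a \in X\cdot Y$ holds is not a function of the two booleans $a \in X$ and $a \in Y$, whereas the corresponding statement for the empty word is. The sketch below exhibits one counterexample and checks the empty-word rule on a small universe; the particular languages and the three-word universe are assumptions chosen for illustration.

```python
from itertools import combinations

def concat(xs, ys):
    """Concatenation of two finite languages."""
    return {x + y for x in xs for y in ys}

# Two pairs of languages with identical membership bits for the word "a" ...
x1, y1 = {"a"}, {""}
x2, y2 = {"a"}, {"b"}
same_inputs = (("a" in x1, "a" in y1) == ("a" in x2, "a" in y2))
# ... but different membership of "a" in the concatenation.
different_outputs = ("a" in concat(x1, y1)) != ("a" in concat(x2, y2))

# By contrast, the empty-word rule holds for every pair of sublanguages
# of a small universe of words.
universe = ["", "a", "aa"]
languages = [set(c) for r in range(len(universe) + 1)
             for c in combinations(universe, r)]
empty_word_rule = all(("" in concat(x, y)) == ("" in x and "" in y)
                      for x in languages for y in languages)

print(same_inputs, different_outputs, empty_word_rule)
```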

Fusion Theorem

$f.\mu g = \mu h \;\Leftarrow\; f \circ g = h \circ f$.

That is, $f.\mu g = \mu h$ provided that

1. $f$ is a lower adjoint,
2. $f \circ g = h \circ f$.

Strategy: $f$ is the extension, $\hat{m}$, to languages of a measure $m$ on words. The word and language measures $m$ and $\hat{m}$ are constructed so that condition 1 is automatically true, and condition 2 is true if $m.(u\cdot v) = m.u \times m.v$. The range of $m$ is a regular algebra.

Generalising the problem to a more sophisticated regular algebra is often needed in order to implement the strategy.

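Measure $m$ on word $u$

$\#u$ (length of $u$); $\mathit{true}$; $X = u$.

Extension $\hat{m}$ to language $S$

$\#S = \langle \Downarrow u : u \in S : \#u \rangle$ (the minimum of $\#u$ over all $u$ in $S$)

$S \neq \phi \;\equiv\; \langle \exists u : u \in S : \mathit{true} \rangle$

$X \in S \;\equiv\; \langle \exists u : u \in S : X = u \rangle$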

Regular Algebra

A regular algebra is a tuple $(A, \times, +, \le, 0, 1)$ where

(a) $(A, \times, 1)$ is a monoid,
(b) $(A, \le, +, 0)$ is a complete, universally distributive lattice with least element $0$ and binary supremum operator $+$,
(c) for all $a \in A$, the endofunctions $(a\times)$ and $(\times a)$ are both lower adjoints in Galois connections between $(A, \le)$ and itself.

(Omitting universal distributivity, this is a Standard Kleene Algebra, Conway 1971.)

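Examples

$\mathbb{B} = (\{T, F\}, \wedge, \vee, \Rightarrow, F, T)$.

$\mathit{Cost} = (\mathbb{R}_{\ge 0} \cup \{\infty\}, \oplus, \downarrow, \ge, \infty, 0)$ where
$x \oplus y = \mathbf{if}\ x = \infty \vee y = \infty \rightarrow \infty \;\Box\; x \neq \infty \wedge y \neq \infty \rightarrow x + y\ \mathbf{fi}$.

$\mathit{Bottleneck} = (\mathbb{R} \cup \{-\infty, \infty\}, \downarrow, \uparrow, \le, -\infty, \infty)$.

$\mathit{Cost} \times \mathit{Bottleneck}$, where the ordering on pairs is lexicographic.

Non-Example

$\mathit{Bottleneck} \times \mathit{Cost}$, where the ordering on pairs is lexicographic.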
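The example algebras can be written down as small records of operations and a few of the regular-algebra laws checked on sample values. This is a hedged sketch: the `Algebra` record, the instance names and the sample values are assumptions introduced here, and the check covers only associativity, binary distributivity and the unit and zero laws on finite samples, not completeness or universal distributivity.

```python
import math
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable

@dataclass(frozen=True)
class Algebra:
    times: Callable[[Any, Any], Any]  # monoid product
    plus: Callable[[Any, Any], Any]   # binary supremum
    zero: Any                         # least element
    one: Any                          # unit of the product

BOOL = Algebra(lambda x, y: x and y, lambda x, y: x or y, False, True)
COST = Algebra(lambda x, y: x + y, min, math.inf, 0)
BOTTLENECK = Algebra(min, max, -math.inf, math.inf)

def laws_hold(alg, samples):
    """Check a few regular-algebra laws on a finite list of sample values."""
    for a, b, c in product(samples, repeat=3):
        if alg.times(a, alg.times(b, c)) != alg.times(alg.times(a, b), c):
            return False
        if alg.times(a, alg.plus(b, c)) != alg.plus(alg.times(a, b), alg.times(a, c)):
            return False
        if alg.times(alg.plus(b, c), a) != alg.plus(alg.times(b, a), alg.times(c, a)):
            return False
    for a in samples:
        if alg.times(alg.one, a) != a or alg.times(a, alg.one) != a:
            return False
        if alg.times(alg.zero, a) != alg.zero or alg.times(a, alg.zero) != alg.zero:
            return False
        if alg.plus(alg.zero, a) != a:
            return False
    return True

print(laws_hold(BOOL, [False, True]))
print(laws_hold(COST, [0, 1, 2, 5, math.inf]))
print(laws_hold(BOTTLENECK, [-math.inf, -3, 0, 4, math.inf]))
```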

Regular Homomorphism

Let $\mathcal{R} = (R, \times, \sqcup, \sqsubseteq, 0_R, 1_R)$ and $\mathcal{S} = (S, \cdot, +, \le, 0_S, 1_S)$ be regular algebras. Suppose $m$ is a function with domain $R$ and range $S$. Then $m$ is said to be a regular homomorphism if $m$ is a monoid homomorphism (from $(R, \times, 1_R)$ to $(S, \cdot, 1_S)$) and it is a lower adjoint in a Galois connection between the two orderings.

Extending Measures

Suppose that $(M, \cdot, 1_M)$ is a monoid and that $\mathcal{R} = (R, \times, +, \le, 0_R, 1_R)$ is a regular algebra. Suppose $m$ is a function with domain $M$ and range $R$. Consider the power set algebra $(2^M, \cdot, \cup, \subseteq, \phi, \{1_M\})$. Define $\hat{m}$, the extension of $m$ to subsets of $M$ (elements of $2^M$), by

$\hat{m}.S = \langle \Sigma x : x \in S : m.x \rangle$.

Examples: $\#S$, $S \neq \phi$, $X \in S$.

Lemma: $\hat{m}$ is a regular homomorphism equivales $m$ is a monoid homomorphism.

Interpreting a Grammar

Suppose $G = (N, T, P, S)$ is a context-free grammar. Suppose $\mathcal{R} = (R, \times, +, \le, 0_R, 1_R)$ is a regular algebra. Suppose $m$ is a function with range $R$ and domain $T$. Then the interpretation of $G$ in $\mathcal{R}$ under $m$ is the endofunction on $R^N$ obtained by interpreting terminal symbols via $m$, concatenation (on the rhs of productions) as $\times$, choice between productions as $+$, and the empty word as $1_R$.

Examples

$S ::= aSS \mid \varepsilon$.

Interpretation of $G$ in the regular algebra of languages under the function that maps $t \in T$ to $\{t\}$:
$\langle X :: \{a\}\cdot X\cdot X \,\cup\, \{\varepsilon\} \rangle$.

Interpretation of $G$ in $\mathbb{B}$ under the function that maps $t$ to $\mathit{true}$:
$\langle b :: (\mathit{true} \wedge b \wedge b) \vee \mathit{true} \rangle$.

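Theorem

Suppose $G = (N, T, P, S)$ is a context-free grammar. Suppose $\mathcal{R} = (R, \times, +, \le, 0_R, 1_R)$ is a regular algebra. Suppose $m$ is a monoid homomorphism to $(R, \times, 1_R)$ from $(T^*, \cdot, \varepsilon)$. Suppose $\hat{m}$ is the extension of $m$ to the regular algebra of languages over alphabet $T$. Suppose $f$ is the interpretation of $G$ in the regular algebra of languages under the function that maps $t \in T$ to $\{t\}$. Suppose $g$ is the interpretation of $G$ in $\mathcal{R}$ under $m$. Then

$\hat{m}.\mu f = \mu g$.

Example (nullability):

$\varepsilon \in \mu\langle X :: \{a\}\cdot X \cup X\cdot X \cup \{\varepsilon\}\rangle \;\equiv\; \mu\langle b :: (\varepsilon \in \{a\} \wedge b) \vee (b \wedge b) \vee \varepsilon \in \{\varepsilon\}\rangle$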
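The theorem suggests a single generic solver: interpret the productions in a chosen regular algebra and iterate to the least fixed point. The sketch below is one possible reading, not the author's implementation. The grammar encoding (a dict from nonterminal to a list of alternatives), the `Algebra` record, the function `solve` and the second grammar `S ::= aSb | c` are all assumptions introduced for illustration.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Algebra:
    times: Callable[[Any, Any], Any]
    plus: Callable[[Any, Any], Any]
    zero: Any
    one: Any

BOOL = Algebra(lambda x, y: x and y, lambda x, y: x or y, False, True)
COST = Algebra(lambda x, y: x + y, min, math.inf, 0)

# Each nonterminal maps to its alternatives; [] is the empty word.
SLIDE_GRAMMAR = {"S": [["a", "S", "S"], []]}     # S ::= aSS | ε
OTHER_GRAMMAR = {"S": [["a", "S", "b"], ["c"]]}  # hypothetical: S ::= aSb | c

def solve(grammar, alg, m):
    """Least solution of the grammar's equations in `alg`, terminals read via `m`."""
    env = {nt: alg.zero for nt in grammar}
    while True:
        new_env = {}
        for nt, alternatives in grammar.items():
            total = alg.zero
            for alternative in alternatives:
                value = alg.one  # the empty word is interpreted as the unit
                for symbol in alternative:
                    factor = env[symbol] if symbol in grammar else m(symbol)
                    value = alg.times(value, factor)
                total = alg.plus(total, value)
            new_env[nt] = total
        if new_env == env:
            return env
        env = new_env

# Shortest word length: every terminal has length 1.
print(solve(SLIDE_GRAMMAR, COST, lambda t: 1)["S"])
print(solve(OTHER_GRAMMAR, COST, lambda t: 1)["S"])
# Nullability: no single terminal is the empty word.
print(solve(SLIDE_GRAMMAR, BOOL, lambda t: False)["S"])
print(solve(OTHER_GRAMMAR, BOOL, lambda t: False)["S"])
```

Swapping the algebra and the terminal measure changes the question answered, while the iteration itself stays the same.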

General CFG Recognition

Given a word $X$ and a context-free grammar $G$, determine whether $X$ is a word in the language generated by $G$.

Consider the measure defined by extending $m$, where $m.u \equiv X = u$. The extension is the function $(X \in)$ on languages.

Problem: the function $m$ is a monoid homomorphism equivales $X = \varepsilon$.

Solution: generalise the problem so that the range of $m$ is a graph algebra.

Graphs

Suppose $r$ is a binary relation and suppose $A$ is a set. A (labelled) graph of dimension $r$ over $A$ is a function $f$ with domain $r$ and range $A$. Elements of relation $r$ are called edges. We will use $G_r A$ to denote the class of all labelled graphs of dimension $r$ over $A$. If $f$ is a graph and the pair $(i, j)$ is an element of $r$, then $i\,f\,j$ will be used to denote the application of $f$ to the pair $(i, j)$.

Addition and Product of Graphs

Suppose $\mathcal{R} = (A, \times, +, \le, 0, 1)$ is a regular algebra. Then zero and the addition and product operators of $\mathcal{R}$ can be extended to graphs as follows.

Two graphs $f$ and $g$ of the same dimension $r$ are ordered according to the rule:
$f \le g \;\equiv\; \langle \forall i, j :: i\,f\,j \le i\,g\,j \rangle$.

Supremum is likewise pointwise. In particular, $f$ and $g$ of the same dimension $r$ are added according to the rule: for all pairs $(i, j)$ in $r$,
$i\,(f + g)\,j = i\,f\,j + i\,g\,j$.

Two graphs $f$ and $g$ of dimensions $r$ and $s$ can be multiplied to form a graph of dimension $r \circ s$ according to the rule: for all pairs $(i, j)$ in $r \circ s$,
$i\,(f \times g)\,j = \langle \Sigma k : (i, k) \in r \wedge (k, j) \in s : i\,f\,k \times k\,g\,j \rangle$.

Finally, the zero graph, denoted by $0$, is defined by: for all pairs $(i, j)$ in $r$, $i\,0\,j = 0$.

Graph Regular Algebras

Suppose $\mathcal{R} = (A, \times, +, \le, 0, 1)$ is a regular algebra with carrier $A$, and suppose $r$ is a reflexive, transitive relation. Define an ordering, addition and product operators as above. Define the unit graph, denoted by $1$, by

$i\,1\,j = \mathbf{if}\ i = j \rightarrow 1 \;\Box\; i \neq j \rightarrow 0\ \mathbf{fi}$.

(Note that $G_r A$ is closed under the product operation and contains $1$ on account of the assumptions that $r$ is transitive and reflexive, respectively.)

Then the algebra $G_r \mathcal{R} = (G_r A, \times, +, \le, 0, 1)$ so defined is a regular algebra.
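A graph algebra over the min-cost algebra can be sketched with dictionaries keyed by edges. The example below fixes the edge relation to the pairs $(i, j)$ with $0 \le i \le j \le N$ (reflexive and transitive), takes product as minimum-over-midpoints of sums, and checks the unit, zero, associativity and distributivity laws on sample graphs. The names `SEG`, `times`, `plus`, `sample` and the sample data are assumptions made for illustration.

```python
import math

N = 3
# Edge relation: all pairs (i, j) with 0 <= i <= j <= N.
SEG = [(i, j) for i in range(N + 1) for j in range(i, N + 1)]

def times(f, g):
    """Graph product over the min-cost algebra: min over midpoints k of f[i,k] + g[k,j]."""
    return {(i, j): min(f[(i, k)] + g[(k, j)] for k in range(i, j + 1))
            for (i, j) in SEG}

def plus(f, g):
    """Pointwise supremum, which is minimum in the min-cost algebra."""
    return {edge: min(f[edge], g[edge]) for edge in SEG}

ZERO = {edge: math.inf for edge in SEG}
ONE = {(i, j): (0 if i == j else math.inf) for (i, j) in SEG}

def sample(seed):
    """An arbitrary graph of small non-negative integers, for testing only."""
    return {(i, j): (seed * (i + 2) + 3 * j) % 7 for (i, j) in SEG}

f, g, h = sample(1), sample(2), sample(5)
unit_law = times(ONE, f) == f and times(f, ONE) == f
zero_law = times(ZERO, f) == ZERO and times(f, ZERO) == ZERO
associative = times(f, times(g, h)) == times(times(f, g), h)
distributive = times(f, plus(g, h)) == plus(times(f, g), times(f, h))
print(unit_law, zero_law, associative, distributive)
```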

Cocke, Kasami, Younger

Let $X$ be a given word (the input string) and let $N$ be the length of $X$. We use $X$ to define a measure on words and then we extend the measure to sets and then to vectors of sets. The measure of word $u$ is a graph of Booleans that determines which segments of $X$ are equal to $u$.

Index the symbols of $X$ from $0$ onwards. The edge relation of the graph is the set of pairs $(i, j)$ such that $0 \le i \le j \le N$ and will be denoted by $\mathit{seg}$. Now, with $(i, j)$ in the relation $\mathit{seg}$, let $X[i..j)$ denote the segment of word $X$ beginning at index $i$ and ending at index $j - 1$. Now define

$m.u = \langle i, j :: X[i..j) = u \rangle$.

This defines $m.u$ to be a boolean graph of dimension $\mathit{seg}$. Moreover, $m$ is a monoid homomorphism and $\mathit{seg}$ is reflexive and transitive. The extension is

$\hat{m}.S = \langle i, j :: \langle \exists u : u \in S : X[i..j) = u \rangle \rangle$

so that $0\,(\hat{m}.S)\,N \equiv X \in S$.
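Recognition can then be sketched as a fixed-point computation over boolean graphs indexed by segments of the input. The code below is a minimal illustration rather than the classical tabular formulation: it iterates the grammar's equations in the graph algebra until nothing changes and reads off the entry for the whole word. The grammar encoding, the function `recognise` and the balanced-parentheses grammar are assumptions introduced here; terminals are assumed to be single characters.

```python
def recognise(grammar, start, word):
    """Decide membership of `word` by solving the grammar over boolean segment graphs."""
    n = len(word)
    seg = [(i, j) for i in range(n + 1) for j in range(i, n + 1)]

    def measure(terminal):
        # m.t: which segments of the word are equal to the terminal t.
        return {(i, j): word[i:j] == terminal for (i, j) in seg}

    def times(f, g):
        return {(i, j): any(f[(i, k)] and g[(k, j)] for k in range(i, j + 1))
                for (i, j) in seg}

    def plus(f, g):
        return {edge: f[edge] or g[edge] for edge in seg}

    zero = {edge: False for edge in seg}
    one = {(i, j): i == j for (i, j) in seg}  # m.ε is the unit graph

    env = {nt: zero for nt in grammar}
    while True:
        new_env = {}
        for nt, alternatives in grammar.items():
            total = zero
            for alternative in alternatives:
                value = one
                for symbol in alternative:
                    factor = env[symbol] if symbol in grammar else measure(symbol)
                    value = times(value, factor)
                total = plus(total, value)
            new_env[nt] = total
        if new_env == env:
            return env[start][(0, n)]
        env = new_env

SLIDE_GRAMMAR = {"S": [["a", "S", "S"], []]}   # S ::= aSS | ε
PARENS = {"S": [["(", "S", ")", "S"], []]}     # hypothetical: S ::= (S)S | ε

print(recognise(SLIDE_GRAMMAR, "S", "aaa"), recognise(SLIDE_GRAMMAR, "S", "ab"))
print(recognise(PARENS, "S", "(()())"), recognise(PARENS, "S", "(()"))
```

Because the boolean graphs form a finite lattice and every operation is monotonic, the iteration is guaranteed to stop.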

Error Repair

Let $X$ be a given word (the input string) and let $N$ be the length of $X$.

Problem: determine the minimum number of insert, delete and/or change operations needed to edit $X$ into a word in the language generated by context-free grammar $G$.

As in the Cocke, Kasami, Younger algorithm, we use $X$ to define a measure on words and then we extend the measure to sets. The measure of word $u$ is a triangular graph of numbers that determines how many edit operations are required to transform each segment of $X$ to the word $u$.

Edit Distance

Let $\mathit{dist}(u, v)$ denote the minimum number of non-ok edit operations needed to transform word $u$ into word $v$ using a sequence of the above edit operations. Now define

$m.u = \langle i, j :: \mathit{dist}(X[i..j), u) \rangle$.

This defines $m.u$ to be a graph of numbers. The numbers, augmented by $\infty$, form the min-cost regular algebra discussed earlier. Thus graphs over numbers also form a regular algebra. Taking this as the range algebra, the extension of the measure $m$ to sets is

$\hat{m}.S = \langle i, j :: \langle \Downarrow u : u \in S : \mathit{dist}(X[i..j), u) \rangle \rangle$

so that $0\,(\hat{m}.S)\,N$ is the minimum number of edit operations required to repair the word $X$ to a word in $S$.

Problem: $m.\varepsilon$ is not the unit of multiplication. But the function $m$ does distribute through concatenation.
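The repair cost can be sketched in the same fixed-point style, now over graphs of numbers with minimum as sum and addition as product. Since $m.\varepsilon$ is not the unit graph, the empty alternative is interpreted as the graph $\langle i, j :: j - i \rangle$ (delete the whole segment) rather than as the unit. This is an illustrative sketch under stated assumptions: unit cost for each insert, delete and change; single-character terminals; and the names `repair_cost`, `dist_to_symbol` and the balanced-parentheses grammar are introduced here, not taken from the slides.

```python
import math

def repair_cost(grammar, start, word):
    """Minimum number of edits turning `word` into a word generated from `start`."""
    n = len(word)
    seg = [(i, j) for i in range(n + 1) for j in range(i, n + 1)]

    def dist_to_symbol(segment, terminal):
        # Edit distance from a segment to a one-character word.
        if not segment:
            return 1  # insert the terminal
        return len(segment) - 1 if terminal in segment else len(segment)

    def measure(terminal):
        return {(i, j): dist_to_symbol(word[i:j], terminal) for (i, j) in seg}

    def times(f, g):
        return {(i, j): min(f[(i, k)] + g[(k, j)] for k in range(i, j + 1))
                for (i, j) in seg}

    def plus(f, g):
        return {edge: min(f[edge], g[edge]) for edge in seg}

    zero = {edge: math.inf for edge in seg}
    m_eps = {(i, j): j - i for (i, j) in seg}  # dist(X[i..j), ε): delete everything

    env = {nt: zero for nt in grammar}
    while True:
        new_env = {}
        for nt, alternatives in grammar.items():
            total = zero
            for alternative in alternatives:
                if not alternative:
                    value = m_eps  # not the unit graph
                else:
                    value = None
                    for symbol in alternative:
                        factor = env[symbol] if symbol in grammar else measure(symbol)
                        value = factor if value is None else times(value, factor)
                total = plus(total, value)
            new_env[nt] = total
        if new_env == env:
            return env[start][(0, n)]
        env = new_env

PARENS = {"S": [["(", "S", ")", "S"], []]}  # hypothetical: S ::= (S)S | ε
print(repair_cost(PARENS, "S", "()"))
print(repair_cost(PARENS, "S", "(()"))
print(repair_cost(PARENS, "S", ")("))
```

The entries are non-negative integers or infinity and can only decrease from one iteration to the next, so the loop terminates.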

Compositional

Let $\mathcal{R} = (R, \times, 1_R)$ and $\mathcal{S} = (S, \cdot, 1_S)$ be monoids. Suppose $m$ is a function with domain $R$ and range $S$. Then $m$ is said to be compositional if, for all $x$ and $y$ in $R$,

$m.(x \times y) = m.x \cdot m.y$.

Using the Unity of Opposites

Let $\mathcal{R} = (R, \times, \sqcup, \sqsubseteq, 0_R, 1_R)$ and $\mathcal{S} = (S, \cdot, +, \le, 0_S, 1_S)$ be regular algebras. Suppose $m$ is a function with domain $R$ and range $S$ that is compositional and is a lower adjoint in a Galois connection between the orderings. Let $m.R$ be the image of $R$ under $m$ and let $m^{\sharp}$ denote its upper adjoint. Then $m.\mathcal{R} = (m.R, \cdot, \oplus, \le, 0_S, m.1_R)$ is a regular algebra, where

$x \oplus y = m.(m^{\sharp}.x \sqcup m^{\sharp}.y)$.

Moreover, $m$ is a regular homomorphism from $\mathcal{R}$ to $m.\mathcal{R}$.

Proof

Much of the proof of this theorem is covered by the unity-of-opposites theorem: the theorem tells us that $(m.R, \le)$ is a complete lattice with binary supremum operator $\oplus$ as defined above and least element $0_S$. To show that $m.\mathcal{R}$ is a regular algebra it thus suffices to show that $m.\mathcal{R}$ admits left and right division operators.

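Proof (Continued)

Suppose $a \backslash b$ denotes right division in $\mathcal{S}$. Note that $m.x \backslash m.y$ is not necessarily in $m.R$. But,

    $m.y \le m.x \backslash m.z$
⇐       { cancellation: $m.(m^{\sharp}.s) \le s$ }
    $m.y \le m.(m^{\sharp}.(m.x \backslash m.z))$
⇐       { monotonicity of $m$ }
    $y \sqsubseteq m^{\sharp}.(m.x \backslash m.z)$
=       { Galois connection }
    $m.y \le m.x \backslash m.z$.

Thus, in $m.\mathcal{R}$ right division is given by the rule

$m.x \cdot m.y \le m.z \;\equiv\; m.y \le m.(m^{\sharp}.(m.x \backslash m.z))$.

Left division is defined similarly.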

Conclusion

A heuristic for problem generalisation based on algebraic properties.