Algorithms for NLP

Regular Expressions
Chris Dyer
Algorithms for NLP 11-711
Adapted from materials by Alon Lavie

Goals of Today's Lecture
- Understand the properties of NFAs with epsilon transitions
- Understand concepts and definitions of regular expressions (REs)
- Understand the relationships among REs, FSAs, and regular languages

NFA with Epsilons
An NFA with $\varepsilon$-transitions is an NFA that may change states without reading an input symbol.
[Diagram: states $q_0$, $q_1$, $q_2$; edge labels $a$, $b$, $c$ and two $\varepsilon$-transitions.]

NFA with Epsilons
A nondeterministic finite automaton is a 5-tuple $M = \langle Q, \Sigma, \delta, q_0, F \rangle$ where
- $Q$ is a finite set of states
- $\Sigma$ is a finite alphabet
- $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$ is the transition relation (without $\varepsilon$-moves, the transition function is $\delta : Q \times \Sigma \to 2^Q$)
- $q_0 \in Q$ is the start (initial) state
- $F \subseteq Q$ is the set of final (accept) states
$L(M)$ is the language of $M$, i.e. the set of strings $M$ accepts.

Definitions
Let $CL_\varepsilon(q) = \{\, p \in Q \mid p \text{ is reachable from } q \text{ by } \varepsilon\text{-moves} \,\}$
We can generalize this to a set $P$:
$CL_\varepsilon(P) = \bigcup_{p \in P} CL_\varepsilon(p)$
Generalized transition definition:
$\hat{\delta}(q, \varepsilon) = CL_\varepsilon(q)$
$\hat{\delta}(q, x\sigma) = CL_\varepsilon(\delta(\hat{\delta}(q, x), \sigma))$
- May be further generalized to sets
- Generalized definition is different than base
Formal definition of $L(M)$:
$L(M) = \{\, w \in \Sigma^* \mid \hat{\delta}(q_0, w) \cap F \neq \emptyset \,\}$
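The definitions above can be sketched in Python. This is a minimal illustration, not part of the original slides: the dictionary encoding of $\delta$, the use of the empty string for $\varepsilon$, and the example automaton `EXAMPLE` are all assumptions made for the sketch.

```python
EPS = ""  # assumption: the empty string stands for an epsilon label


def closure(delta, states):
    """CL_eps(P): all states reachable from the set P by epsilon-moves only."""
    seen = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, EPS), set()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def delta_hat(delta, q, w):
    """Generalized transition: delta_hat(q, eps) = CL(q),
    delta_hat(q, x + sigma) = CL(delta(delta_hat(q, x), sigma))."""
    current = closure(delta, {q})
    for sym in w:
        moved = set()
        for s in current:  # delta applied to a set of states
            moved |= delta.get((s, sym), set())
        current = closure(delta, moved)
    return current


def accepts(delta, q0, finals, w):
    """w is in L(M) iff delta_hat(q0, w) intersects F."""
    return bool(delta_hat(delta, q0, w) & finals)


# Hypothetical example automaton (an assumption, not taken from the slides):
# loops a, b, c on q0, q1, q2, with epsilon-moves q0 -> q1 -> q2.
EXAMPLE = {
    ("q0", "a"): {"q0"}, ("q0", EPS): {"q1"},
    ("q1", "b"): {"q1"}, ("q1", EPS): {"q2"},
    ("q2", "c"): {"q2"},
}
```

With `q2` as the only final state, `accepts(EXAMPLE, "q0", {"q2"}, "aabcc")` follows the $\varepsilon$-moves between blocks of letters, while a string such as `"ba"` ends in the empty state set and is rejected.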

NFA & $\varepsilon$-NFA Equivalence
Theorem. For every NFA $A$ with epsilon moves there is an equivalent NFA $A'$ without, s.t. $L(A) = L(A')$.
Proof. This is a constructive proof.
Construction. Given $A = \langle Q, \Sigma, \delta, q_0, F \rangle$
We construct $A' = \langle Q, \Sigma, \delta', q_0, F' \rangle$
$F' = F \cup \{q_0\}$ if $CL_\varepsilon(q_0) \cap F \neq \emptyset$, and $F' = F$ otherwise.
Using the generalized transition definition, $\delta'(q, \sigma) = \hat{\delta}(q, \sigma)$
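A possible Python sketch of this construction follows. It is illustrative only: the dictionary encoding of $\delta$, the empty string standing for $\varepsilon$, and the function name `remove_epsilons` are assumptions, not part of the slides.

```python
EPS = ""  # assumption: the empty string stands for an epsilon label


def closure(delta, states):
    """CL_eps(P): all states reachable from the set P by epsilon-moves only."""
    seen = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, EPS), set()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def remove_epsilons(states, alphabet, delta, q0, finals):
    """Build delta' and F' of the epsilon-free NFA A' from A."""
    new_delta = {}
    for q in states:
        for sym in alphabet:
            # delta'(q, sym) = delta_hat(q, sym) = CL(delta(CL(q), sym))
            moved = set()
            for s in closure(delta, {q}):
                moved |= delta.get((s, sym), set())
            targets = closure(delta, moved)
            if targets:
                new_delta[(q, sym)] = targets
    # F' = F + {q0} if CL(q0) meets F, else F
    new_finals = set(finals)
    if closure(delta, {q0}) & finals:
        new_finals.add(q0)
    return new_delta, new_finals
```

The state set, alphabet, and start state are unchanged, so only $\delta'$ and $F'$ are returned. No key of the resulting dictionary carries the `EPS` label.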

NFA & $\varepsilon$-NFA Equivalence
It remains to show: $\delta'(q_0, x) = \hat{\delta}(q_0, x)$
(i) base: $|x| = 1$
$\delta'(q, a) = \hat{\delta}(q, a)$ by definition of $\delta'$

Regular Expression
A regular expression is a way of describing the languages accepted by FSAs. Defined recursively:
1. $\emptyset$ is an RE denoting the empty set
2. $\varepsilon$ is an RE denoting the set $\{\varepsilon\}$
3. for each $a \in \Sigma$, $a$ is an RE denoting $\{a\}$
4. If $r$ and $s$ are REs denoting the languages $R$ and $S$, then
   $(r \mid s)$ denotes $R \cup S$
   $(rs)$ denotes $R \cdot S$
   $r^*$ denotes $R^*$
Precedence means parentheses can sometimes be omitted: $*$ binds most tightly, then concatenation, then union.
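The recursive definition can be mirrored by a small Python evaluator. This is a sketch under stated assumptions: REs are encoded as nested tuples with hypothetical tags (`"empty"`, `"eps"`, `"sym"`, `"union"`, `"concat"`, `"star"`), and because $R^*$ is usually infinite, only strings up to a length bound `max_len` are enumerated.

```python
def language(r, max_len):
    """Strings of length <= max_len in the language denoted by the RE r."""
    kind = r[0]
    if kind == "empty":            # rule 1: the empty set
        return set()
    if kind == "eps":              # rule 2: {epsilon}
        return {""}
    if kind == "sym":              # rule 3: {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "union":            # rule 4: R union S
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "concat":           # rule 4: R . S
        left = language(r[1], max_len)
        right = language(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":             # rule 4: R*
        base = language(r[1], max_len)
        result = {""}
        frontier = {""}
        while frontier:            # append one more factor from R each round
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown RE node: {kind!r}")
```

For instance, the star of the union of the symbols `0` and `1`, evaluated with `max_len=2`, yields the seven binary strings of length at most two.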

Examples
$(0 \mid 1)^*$ denotes all finite words over $\Sigma = \{0, 1\}$
$0^* \mid 1^*$ denotes all finite words containing only 0s or only 1s

REs and $\varepsilon$-NFAs
Theorem. For every RE $r$ there is an $\varepsilon$-NFA $A$ s.t. $L(r) = L(A)$.
Proof. We will construct $A$ compositionally using induction on the number of operators in $r$.
Base cases. $r$ has 0 operators.
- $r = \emptyset$: states $q_0$ and $q_f$ with no transition between them
- $r = \varepsilon$: a single $\varepsilon$-transition from $q_0$ to $q_f$
- $r = a$: a single transition labeled $a$ from $q_0$ to $q_f$
Note: we assume there is exactly one final state.
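The three base-case automata can be written down directly. The sketch below is illustrative: the dictionary layout (`"states"`, `"delta"`, `"start"`, `"final"`), the helper `new_state`, and the empty string standing for $\varepsilon$ are assumptions introduced here, not notation from the slides.

```python
from itertools import count

EPS = ""           # assumption: the empty string stands for an epsilon label
_fresh = count()   # hypothetical generator of fresh state names


def new_state():
    """Return a state name not used before in this process."""
    return f"s{next(_fresh)}"


def _two_state_nfa(label=None):
    """Automaton with start q0 and single final qf; one edge if label is given."""
    q0, qf = new_state(), new_state()
    delta = {} if label is None else {(q0, label): {qf}}
    return {"states": {q0, qf}, "delta": delta, "start": q0, "final": qf}


def nfa_empty():
    """r = empty set: q0 and qf with no transition, so nothing is accepted."""
    return _two_state_nfa()


def nfa_epsilon():
    """r = epsilon: one epsilon-transition from q0 to qf."""
    return _two_state_nfa(EPS)


def nfa_symbol(a):
    """r = a: one transition labeled a from q0 to qf."""
    return _two_state_nfa(a)
```

Each constructor returns exactly one final state, matching the note on the slide, which keeps the inductive constructions uniform.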

REs and $\varepsilon$-NFAs
Inductive step. We assume the hypothesis is true for all REs with at most $n$ operators, and then prove it is true for $n+1$ operators.
There are three cases to be dealt with:
(1) $r = r_1 \mid r_2$
(2) $r = r_1 r_2$
(3) $r = r_1^*$

Case 1: $r = r_1 \mid r_2$
By the inductive hypothesis, there are two $\varepsilon$-NFAs $A_1$ and $A_2$ with $L(A_1) = L(r_1)$ and $L(A_2) = L(r_2)$.
[Diagram: $A_1$ with start state $q_{01}$ and final state $q_{f1}$; $A_2$ with start state $q_{02}$ and final state $q_{f2}$.]
Construct the following $A$: a new start state $q_0$ with $\varepsilon$-transitions to $q_{01}$ and $q_{02}$, and $\varepsilon$-transitions from $q_{f1}$ and $q_{f2}$ to a new final state $q_f$.

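Case 1: $r = r_1 \mid r_2$
Formally, if
$A_1 = \langle Q_1, \Sigma, \delta_1, q_{01}, \{q_{f1}\} \rangle$
$A_2 = \langle Q_2, \Sigma, \delta_2, q_{02}, \{q_{f2}\} \rangle$
then
$A = \langle Q_1 \cup Q_2 \cup \{q_0\} \cup \{q_f\}, \Sigma, \delta, q_0, \{q_f\} \rangle$
$\delta(q_0, \varepsilon) = \{q_{01}, q_{02}\}$
$\delta(q_{f1}, \varepsilon) = \{q_f\}$
$\delta(q_{f2}, \varepsilon) = \{q_f\}$
$\delta(q, \sigma) = \delta_1(q, \sigma), \ \forall q \in Q_1 - \{q_{f1}\}, \ \sigma \in \Sigma \cup \{\varepsilon\}$
$\delta(q, \sigma) = \delta_2(q, \sigma), \ \forall q \in Q_2 - \{q_{f2}\}, \ \sigma \in \Sigma \cup \{\varepsilon\}$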
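The union construction can be sketched in Python as follows. The dictionary layout, the default names `"q0"` and `"qf"` for the new states, and the small simulator `accepts` are assumptions added for illustration. The sketch also assumes that the two state sets are disjoint and do not already contain the new state names.

```python
EPS = ""  # assumption: the empty string stands for an epsilon label


def union(a1, a2, q0="q0", qf="qf"):
    """Build A for r1 | r2 from A1 and A2 (each with a single final state)."""
    delta = {}
    for part in (a1, a2):
        for (q, sym), targets in part["delta"].items():
            if q == part["final"]:
                continue  # delta_i is copied only for q in Q_i - {q_fi}
            delta[(q, sym)] = set(targets)
    delta[(q0, EPS)] = {a1["start"], a2["start"]}
    delta[(a1["final"], EPS)] = {qf}
    delta[(a2["final"], EPS)] = {qf}
    return {"states": a1["states"] | a2["states"] | {q0, qf},
            "delta": delta, "start": q0, "final": qf}


def accepts(nfa, w):
    """Simulate an epsilon-NFA on w by tracking the reachable state set."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for p in nfa["delta"].get((q, EPS), set()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen

    current = closure({nfa["start"]})
    for sym in w:
        moved = set()
        for q in current:
            moved |= nfa["delta"].get((q, sym), set())
        current = closure(moved)
    return nfa["final"] in current


# Hypothetical inputs: one-symbol automata for "a" and "b".
A1 = {"states": {"a0", "a1"}, "delta": {("a0", "a"): {"a1"}},
      "start": "a0", "final": "a1"}
A2 = {"states": {"b0", "b1"}, "delta": {("b0", "b"): {"b1"}},
      "start": "b0", "final": "b1"}
A = union(A1, A2)
```

Here `A` accepts `"a"` and `"b"` but neither the empty string nor `"ab"`, as expected for the union of the two one-symbol languages.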

Case 1: $r = r_1 \mid r_2$
It remains to show that $L(A) = L(A_1) \cup L(A_2)$.
How to do this? Set containment: show $L(A) \subseteq L(A_1) \cup L(A_2)$ and $L(A_1) \cup L(A_2) \subseteq L(A)$.

Cases 2 & 3
The strategy for showing these cases proceeds as with Case 1.
Refer to the textbook for details.