Indexed Languages and why you care Presented by Pieter Hooimeijer 2008-02-07 1
Think Way Back... www.xkcd.com 2
...Far Enough:... CS NCF DCF REG REG 3
Formal Language Classes... CS NCF DCF REG REG PDA 4
Formal Language Classes... CS NCF PDA DCF REG REG DFA 5
Formal Language Classes... CS NCF Machines PDA Formalism S asb z DCF REG REG DFA a(ab)*b 6
Formal Language Classes... CS NCF Pump DCF REG REG Pump 7
The Context-Free Pumping Lemma Suppose L = { anbncn n = 1...} is contextfree. By the pumping lemma, any string s with s p can be 'pumped.' p 8
The Context-Free Pumping Lemma What does 'can be pumped' mean? s = uvxyz 1) vy >0 2) vxy p 3) uvixyiz is in L for all i 0 9
The Context-Free Pumping Lemma Suppose L = { anbncn n = 1...} is context-free. Consider s = apbpcp; for any split s = uvxyz, we have: - if v contains a's, then y cannot contain c's - if y contains c's, then v cannot contain a's - a problem: uv0xy0z = uxz will never be in L 10
Moving Right Along 11
Motivation Some Examples Can solve problems by phrasing them as 'language' problems: Finding valid control flow graph paths Solving set constraint problems Static string analysis Lexing, parsing For fun... 12
Example String Variables www.xkcd.com 13
The Nugget Some Code: x = 'z'; while(n < 5) { x = '('. x. ')'; n ++; } We want a context free grammar to model x Suppose we don't know anything about n 14
The Nugget Some Code: >x = 'z'; Grammar: A -> z while(n < 5) { x = '('. x. ')'; n ++; } 15
The Nugget Some Code: x = 'z'; while(n < 5) { > x = '('. x. ')'; n ++; } Grammar: A -> z B -> (A) [True] [n < 5] 16
The Nugget Some Code: x = 'z'; while(n < 5) { x = '('. x. ')'; n ++; >>} Grammar: A -> z B -> (A) C -> A B [True] [n < 5] [n < 5] 17
The Nugget Some Code: x = 'z'; while(n < 5) { x = '('. x. ')'; n ++; >>} Grammar: A -> z B -> (C) C -> A B [True] [n < 5] [True] 18
The Nugget Some Code: x = 'z'; while(n < 5) { x = '('. x. ')'; n ++; >>} Grammar: X -> C A -> z B -> (C) C -> A B [True] [n < 5] [True] 19
Indexed Languages... CS *I* NCF DCF REG REG 20
Definition: Indexed Grammar G = (N, T, F, P, S) P : { N ((NF*) T)* } F : { { N (N T)*}f } 21
Derivation Rule Example 1 S Sf S ABgC S Sf Sff Aff Bgff Cff 22
Derivation Rule Example 1 S Sf S ABgC 'normal' production: copy index stack S Sf Sff Aff Bgff Cff 23
Derivation Rule Example 2 S Af F = { { A B}f } S Af B 24
Derivation Rule Example 2 S Af F = { { A B}f } S Af B 25
Derivation Rule Example 2 S Af F = { { A B}f } 'index' production: pop leftmost index (for current nonterminal only) S Af B 26
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} f = {A a B b C c} 27
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} f = {A a B b C c} What is the language of this grammar? 28
An Example Grammar S Df D Dg ABC S g = {A Aa B Bb C Cc} f = {A a B b C c} 29
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} S Df f = {A a B b C c} 30
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} S Df Dgf f = {A a B b C c} 31
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} S Df Dgf Agf Bgf Cgf f = {A a B b C c} 32
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} S Df Dgf Agf Bgf Cgf f = {A a B b C c} 33
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} f = {A a B b C c} S Df Dgf Agf Bgf Afa Cgf 34
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} f = {A a B b C c} S Df Dgf Agf Bgf Afa a Cgf aa 35
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} f = {A a B b C c} S Df Dgf Agf Bgf Afa Bfb a b aa bb Cgf 36
An Example Grammar S Df D Dg ABC g = {A Aa B Bb C Cc} f = {A a B b C c} S Df Dgf Agf Bgf Cgf Afa Bfb Cfc c a b aa bb cc 37
Where am I? { anbncn n = 1...}... CS *I* NCF DCF REG REG 38
... CS *I* DCF NCF Machines PDA Formalisms S asb z Pumps REG REG DFA a(ab)*b 39
... CS *I* DCF NCF NSA PDA S asfb z S asb z? REG REG DFA a(ab)*b 40
Associated Automaton 41
Associated Automaton 42
(Actual NSA) [Gilman, Shapiro 1998] 43
But will it pump? It's complicated on derivation trees rather than strings [Hayashi 1973] Several corollaries exist [Gilman 1996] (less general; easier to use) 44
The Lemma L : indexed language m : positive integer There is a constant k > 0 so that each w in L can be split (w = w1w2...wr) subject to:
The Lemma L : indexed language m : positive integer There is a constant k > 0 so that each w in L can be split (w = w1w2...wr) subject to: m<r k each wi > 0
The Lemma L : indexed language m : positive integer There is a constant k > 0 so that each w in L can be split (w = w1w2...wr) subject to: m<r k each wi > 0 any m-sized set of wi's is a subset of some w' in L; w' is a subproduct of w
Lemma Example
Lemma Example
(Montage Time) 50
{ anbncn n = 1...} { (abn)n n = 1...}... CS *I* DCF NCF NSA PDA S asfb z S asb z? REG REG DFA a(ab)*b 51
52
References Indexed Grammars An Extension to Context-Free Grammars (Aho) Nested Stack Automata (Aho) Sequentially Indexed Grammars (van Eijck) A Shrinking Lemma for Indexed Languages (Gilman) On Groups Whose Word Problem is Solved by a Nested Stack Automaton (Gilman and Shapiro) On Derivation Trees of Indexed Grammars (Hayashi) 53