On the Incremental Inference of Context-Free Grammars from Positive Structural Information
International Journal of Systems and Technologies (IJST), Vol. 1, No. 2, KLEF, 2008

Gend Lal Prajapati 1, Narendra S. Chaudhari 2 and Manohar Chandwani 3

1,3 Institute of Engineering & Technology, Department of Computer Engineering, Devi Ahilya University, Khandwa Road, Indore, M.P., India
1 gprajapati.iet@dauniv.ac.in, 3 chandwanim1@rediffmail.com
2 School of Computer Engineering, Nanyang Technological University, Block N4-2a-32, 50 Nanyang Avenue, Singapore
ASNarendra@ntu.edu.sg

Abstract. The primary goal of this paper is to design a so-called incremental version of the effective model that learns context-free grammars from positive samples of their structural descriptions, where the input sample is fed to the learner in an online manner. This is essential in the context of identification in the limit. Efficient, polynomially bounded updating inference algorithms are presented to achieve good incremental behavior in the inference algorithms mrt and mrc of the effective model. We also modify mrc to infer extended reversible context-free grammars from positive structural samples.

Keywords: Context-free grammar (CFG); Extended reversible CFG; Grammatical inference; Reversible skeletal tree automata; Structural sample.

1 Introduction

We consider the problem of inferring context-free languages incrementally from positive-only examples. The problem of designing inference algorithms that find a correct grammar for an unknown language from input data is at the very heart of an important area of machine learning called grammatical inference. It is known from Gold's negative result [1] on identifiability from positive presentation that the class of CFGs cannot be identified in the limit from positive presentations.
In search of an example presentation mechanism that compensates for the lack of explicit negative information in positive samples, learning CFGs in the framework of identification in the limit from positive samples of their structural descriptions has become common in the literature, where a structural description of a CFG is an unlabelled derivation tree of the grammar. An example with some parentheses inserted to indicate the shape of the derivation tree of a CFG, or equivalently an unlabelled derivation tree of the CFG, is called a structural example or simply a skeleton. The well-known efficient tree-automata-based learning model for CFGs from positive structural information was introduced early by Sakakibara [4]. He showed that there exists a class, called reversible CFGs, which can be identified in the limit from positive presentations of structural examples, that is, all and only the unlabelled derivation trees of the unknown CFG, and that the reversible CFG is a normal form for CFGs, that is, reversible CFGs can generate all the context-free languages. We have improved on Sakakibara's work by producing an effective model [3] for the above problem. Its main results are: the output grammar is consistent with the given examples; it makes the task of bottom-up parsing easy; it runs in O(n³) time, where n is the sum of the sizes of the input examples; it achieves an O(n²) saving in storage space over the closely related model [4]; and it infers a grammar from positive-only examples efficiently. However, like Sakakibara's model, it cannot be used for incremental learning, i.e., in the situation where the input sample is fed to the inference algorithm in an online manner, because neither model can update a guess incrementally. In this paper, to solve this computational problem, we present polynomial-time updating schemes that achieve good incremental behavior in our effective model [3]. In addition, we demonstrate an inference scheme for extended reversible CFGs from positive samples of their structural descriptions.

2 Inference Algorithms

We now present our inference scheme, using the definitions and notation relevant to our work from [2, 3].
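For concreteness in what follows, a skeleton can be encoded as nested tuples in which every internal node is the unlabelled symbol σ and every leaf is a terminal. The small sketch below is our own illustration (not part of the model in [2, 3]); it computes the yield of a skeleton, i.e., the sentence whose derivation shape the skeleton records:

```python
def yield_of(skeleton):
    """Frontier of a skeleton encoded as nested tuples: tuples are the
    unlabelled internal (sigma) nodes, strings are the terminal leaves."""
    if isinstance(skeleton, str):
        return [skeleton]
    out = []
    for child in skeleton:          # left-to-right traversal of the children
        out.extend(yield_of(child))
    return out

# the skeleton sigma(sigma(sigma(the), sigma(sigma(girl))),
#                    sigma(sigma(likes), sigma(sigma(a), sigma(sigma(cat)))))
s = ((("the",), (("girl",),)), (("likes",), (("a",), (("cat",),))))
assert yield_of(s) == ["the", "girl", "likes", "a", "cat"]
```

The encoding deliberately carries no nonterminal labels: only the tree shape is available to the learner, which is exactly the information a structural example provides.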
We first present updating algorithms that prepare the inference algorithms mrt [3] (for reversible tree automata from positive samples) and mrc [3] (for reversible CFGs from positive samples of their structural descriptions) to update a hypothesis when the training examples are supplied to the inference algorithm on an online basis. Next, we modify mrc to infer extended reversible CFGs from positive structural samples. A CFG G = (N, Σ, P, S) is extended reversible iff, for P' = P − { S → a | a ∈ Σ }, the grammar G' = (N, Σ, P', S) is reversible [4]. It can easily be shown (using Theorem 1 and Theorem 2 in [3]) that the inference algorithms sketched in this section have polynomially bounded update time.

2.1 Updating Algorithm mirt for Tree Automata
It is useful, in the context of identification in the limit, to show that mrt may be modified to have good incremental behavior. That is, given the output A_f = Bs(S+)/π'_f computed by mrt on input S+, where π'_f is the final partition found by mrt beginning with the trivial partition of the set of states of Bs(S+), and given a new nonempty set of skeletons S++, we may update A_f to the output computed by mrt on input S = S+ ∪ S++. The method for achieving this is illustrated in the algorithm mirt, shown as Algorithm 1. Our main aim is to avoid the overhead of recomputing the partition already found by mrt on input S+; this is guaranteed by the initializations mirt performs before entering the merging phase.

Algorithm 1:
Input: the output A'/π'_f computed by mrt on a nonempty positive sample S+, beginning with the base tree automaton A' = Bs(S+) = (Q', V, δ', F'), and a new nonempty set of skeletons S++, where π'_f is the final partition of the set Q' of states of A' constructed by mrt;
Output: a reversible skeletal tree automaton A such that A = mrt(S+ ∪ S++);
Method:
Let Bs(S++) = (Q'', V, δ'', F'');
Let A = (Q = Q' ∪ Q'', V, δ = δ' ∪ δ'', F = F' ∪ F'');
Let π'_0 be the trivial partition of Q − Q';
Let π_0 = π'_0 ∪ π'_f;
Let B_f ∈ π'_f be the block such that B_f ∩ F' ≠ ∅;
Let π_1 be π_0 with B_f and all blocks B(q, π_0) such that q ∈ F − F' merged;
Let i = 1;
do
  Let j = i;
  for all p of the form p = σ(u_1, …, u_k) ∈ Q
    for all q of the form q = σ(u'_1, …, u'_k) ∈ Q − {p}
      if B(p, π_i) = B(q, π_i) then
        if u_l, u'_l ∈ Q and B(u_l, π_i) ≠ B(u'_l, π_i) for some l (1 ≤ l ≤ k), and B(u_j, π_i) = B(u'_j, π_i) or u_j = u'_j ∈ Σ for 1 ≤ j ≤ k, j ≠ l, then
          Let π_{i+1} be π_i with B(u_l, π_i) and B(u'_l, π_i) merged;
          Increase i by 1;
        fi /* End of if */
      else
        if B(u_j, π_i) = B(u'_j, π_i) or u_j = u'_j ∈ Σ for 1 ≤ j ≤ k then
          Let π_{i+1} be π_i with B(p, π_i) and B(q, π_i) merged;
          Increase i by 1;
        fi /* End of if */
      fi /* End of if */
    end for
  end for
while j ≠ i;
Let f = i and output the tree automaton A/π_f.
End of Algorithm.

2.2 Updating Algorithm mirc for Context-Free Grammars

Further, in the context of identification in the limit of reversible CFGs, we need to modify the inference algorithm mrc to infer reversible CFGs incrementally from positive samples of their structural descriptions. The algorithm mirc for doing this is sketched as Algorithm 2.

Algorithm 2:
Input: the output A'/π'_f computed by mrt on a nonempty positive sample S+, beginning with the base tree automaton A' = Bs(S+) = (Q', V, δ', F'), and a new nonempty set of positive structural examples S++, where π'_f is the final partition of the set Q' of states of A' constructed by mrt;
Output: a reversible context-free grammar G such that G = mrc(S+ ∪ S++);
Method:
Run mirt on the tree automaton A'/π'_f and the set S++;
Let G = G'(mirt(A'/π'_f, S++)) and output the grammar G.
End of Algorithm.

2.3 Inferring Extended Reversible Grammars

The inference algorithm mrc' for extended reversible CFGs from positive samples of their structural descriptions is described as Algorithm 3. It is a modification of the algorithm mrc.

Algorithm 3:
Input: a nonempty positive sample S+ of structural descriptions;
Output: an extended reversible context-free grammar G;
Method:
Let S'+ = S+ − { σ(a) | a ∈ Σ };
Let Uni = S+ ∩ { σ(a) | a ∈ Σ };
Run mrc on the sample S'+ and let G' = (N, Σ, P, S) be mrc(S'+);
Let P' = { S → a | σ(a) ∈ Uni };
Let G = (N, Σ, P ∪ P', S) and output the grammar G.
End of Algorithm.

We now present the updating algorithm mirc', which gives mrc' good incremental behavior for inferring extended reversible CFGs from positive samples of their structural descriptions. The algorithm mirc' is described as Algorithm 4.

Algorithm 4:
Input: the output A'/π'_f computed by mrt on a nonempty positive sample S+ = S'+ − { σ(a) | a ∈ Σ }, beginning with the base tree automaton A' = Bs(S+) = (Q', V, δ', F'), and a new nonempty set of positive structural examples S'++, where π'_f is the final partition of the set Q' of states of A' constructed by mrt;
Output: an extended reversible context-free grammar G such that G = mrc'(S'+ ∪ S'++);
Method:
Let S++ = S'++ − { σ(a) | a ∈ Σ };
Let S = S'+ ∪ S'++;
Let Uni = S ∩ { σ(a) | a ∈ Σ };
Run mirc on the automaton A'/π'_f and the set S++;
Let G' = (N, Σ, P, S) be mirc(A'/π'_f, S++);
Let P' = { S → a | σ(a) ∈ Uni };
Let G = (N, Σ, P ∪ P', S) and output the grammar G.
End of Algorithm.

3 An Example

As an example run, suppose that the algorithm mrc is to infer the following unknown CFG G_U for a simple natural language:

Sentence → Noun_phrase Verb_phrase
Noun_phrase → Determiner Noun_phrase2
Noun_phrase2 → Noun
Noun_phrase2 → Adjective Noun_phrase2
Verb_phrase → Verb Noun_phrase
Determiner → the
Determiner → a
Noun → girl
Noun → cat
Noun → dog
Adjective → young
Verb → likes
Verb → chases.

First suppose that the inference algorithm mrc is given the following structural sample S+:

σ ( σ ( σ ( the ), σ ( σ ( girl ) ) ), σ ( σ ( likes ), σ ( σ ( a ), σ ( σ ( cat ) ) ) ) )
σ ( σ ( σ ( the ), σ ( σ ( girl ) ) ), σ ( σ ( likes ), σ ( σ ( a ), σ ( σ ( dog ) ) ) ) ).

So mrt first constructs the base tree automaton A = Bs(S+) = (Q, V, δ, F) as follows: V_0 = {the, girl, likes, a, cat, dog}. The elements of Q, each of the form q = σ(u_1, …, u_k), where the special symbol σ ∈ Sk, q ∈ Q, and u_1, …, u_k ∈ Q ∪ V_0, are:

q_1 = σ(the)
q_2 = σ(girl)
q_3 = σ(q_2)
q_4 = σ(q_1, q_3)
q_5 = σ(likes)
q_6 = σ(a)
q_7 = σ(cat)
q_8 = σ(q_7)
q_9 = σ(q_6, q_8)
q_10 = σ(q_5, q_9)
q_11 = σ(q_4, q_10) ∈ F
q_12 = σ(dog)
q_13 = σ(q_12)
q_14 = σ(q_6, q_13)
q_15 = σ(q_5, q_14)
q_16 = σ(q_4, q_15) ∈ F.

The trivial partition of the set Q:

π_0 = {[q_1], [q_2], [q_3], [q_4], [q_5], [q_6], [q_7], [q_8], [q_9], [q_10], [q_11], [q_12], [q_13], [q_14], [q_15], [q_16]}.

Beginning with the trivial partition π_0 of Q, mrt now finds the following final partition π_f of Q, with the property that A/π_f is reversible, and outputs A/π_f:

π_f = {[q_1], [q_2], [q_3], [q_4], [q_5], [q_6], [q_7, q_12], [q_8, q_13], [q_9, q_14], [q_10, q_15], [q_11, q_16]}.

Let us write NT_1 = [q_4], NT_2 = [q_10, q_15], NT_3 = [q_1], NT_4 = [q_3], NT_5 = [q_2], NT_6 = [q_5], NT_7 = [q_9, q_14], NT_8 = [q_6], NT_9 = [q_8, q_13], NT_10 = [q_7, q_12], and S = [q_11, q_16]. Then mrc outputs the reversible CFG:

S → NT_1 NT_2
NT_1 → NT_3 NT_4
NT_2 → NT_6 NT_7
NT_3 → the
NT_4 → NT_5
NT_5 → girl
NT_6 → likes
NT_7 → NT_8 NT_9
NT_8 → a
NT_9 → NT_10
NT_10 → cat
NT_10 → dog.

Suppose that in the next stage the following examples S++ are added to the sample:

σ ( σ ( σ ( a ), σ ( σ ( dog ) ) ), σ ( σ ( chases ), σ ( σ ( the ), σ ( σ ( girl ) ) ) ) )
σ ( σ ( σ ( a ), σ ( σ ( dog ) ) ), σ ( σ ( chases ), σ ( σ ( a ), σ ( σ ( cat ) ) ) ) ).
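The construction of Bs(S+) above assigns one state per distinct subtree, so subtrees shared between skeletons, such as σ(a), receive a single state. A small hash-consing sketch (our own illustration, not the paper's implementation; skeletons are encoded as nested tuples with terminals as strings) reproduces the numbering q_1, …, q_16 for the two skeletons of S+:

```python
def base_states(skeletons):
    """One state per distinct subtree (hash-consing), as in Bs(S+):
    equal subtrees of the sample share a state; roots become final states."""
    ids = {}                                   # subtree key -> state number
    def visit(t):
        if isinstance(t, str):                 # a terminal leaf is not a state
            return t
        key = ("sigma",) + tuple(visit(c) for c in t)
        if key not in ids:
            ids[key] = len(ids) + 1            # first occurrence: new state q_i
        return ids[key]
    finals = {visit(t) for t in skeletons}     # states reached at the roots
    return ids, finals

# the two skeletons of the sample S+ from the example, as nested tuples
s1 = ((("the",), (("girl",),)), (("likes",), (("a",), (("cat",),))))
s2 = ((("the",), (("girl",),)), (("likes",), (("a",), (("dog",),))))
states, finals = base_states([s1, s2])
assert len(states) == 16 and finals == {11, 16}   # q_1 ... q_16, F = {q_11, q_16}
assert states[("sigma", "a")] == 6                # sigma(a) is the shared state q_6
```

Because equal subtrees are looked up before a new number is assigned, the second skeleton reuses q_1 through q_6 and contributes only the four dog-specific states and a new root, exactly as in the listing above.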
Now the algorithm mirc updates the above grammar with this information by applying the updating algorithm mirt, which finds the reversible tree automaton corresponding to all examples received up to this stage without redoing earlier computation, as shown in the following. On the addition of S++, the new states of the form q = σ(u_1, …, u_k) are:

q_17 = σ(chases)
q_18 = σ(q_17, q_4)
q_19 = σ(q_14, q_18) (final state)
q_20 = σ(q_17, q_9)
q_21 = σ(q_14, q_20) (final state).

The automaton A = (Q, V, δ, F) is now:
V_0 = {the, girl, likes, a, cat, dog, chases}.
Q = {q_1, q_2, q_3, q_4, q_5, q_6, q_7, q_8, q_9, q_10, q_11, q_12, q_13, q_14, q_15, q_16, q_17, q_18, q_19, q_20, q_21}.
F = {q_11, q_16, q_19, q_21}.

The trivial partition of the new states:
π'_0 = {[q_17], [q_18], [q_19], [q_20], [q_21]}.

Therefore,
π_0 = {[q_17], [q_18], [q_19], [q_20], [q_21]} ∪ {[q_1], [q_2], [q_3], [q_4], [q_5], [q_6], [q_7, q_12], [q_8, q_13], [q_9, q_14], [q_10, q_15], [q_11, q_16]}
= {[q_1], [q_2], [q_3], [q_4], [q_5], [q_6], [q_7, q_12], [q_8, q_13], [q_9, q_14], [q_10, q_15], [q_11, q_16], [q_17], [q_18], [q_19], [q_20], [q_21]}.

Finally, the initial partition π_1 from which the merging passes start is:
π_1 = {[q_1], [q_2], [q_3], [q_4], [q_5], [q_6], [q_7, q_12], [q_8, q_13], [q_9, q_14], [q_10, q_15], [q_11, q_16, q_19, q_21], [q_17], [q_18], [q_20]}.

Beginning with this partition π_1, the algorithm mirt obtains the following final partition π_f and outputs the reversible tree automaton A/π_f:
π_f = {[q_1], [q_2], [q_3], [q_4, q_9, q_14], [q_5, q_17], [q_6], [q_7, q_12], [q_8, q_13], [q_10, q_15, q_18, q_20], [q_11, q_16, q_19, q_21]}.
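The partition bookkeeping traced above is conveniently realized with a union-find (disjoint-set) structure. The sketch below is our own illustration (class and state names are ours, not the paper's code): it seeds the structure with the old blocks of π'_f, adds the new states q_17–q_21 as singletons, and merges the blocks of final states, which yields exactly the starting partition π_1 of the example:

```python
class Partition:
    """Disjoint-set over states; each class is one block of the partition."""
    def __init__(self):
        self.parent = {}
    def add(self, q):
        self.parent.setdefault(q, q)
    def find(self, q):
        while self.parent[q] != q:
            self.parent[q] = self.parent[self.parent[q]]   # path halving
            q = self.parent[q]
        return q
    def merge(self, p, q):
        rp, rq = self.find(p), self.find(q)
        if rp != rq:
            self.parent[rp] = rq
            return True        # a merge happened, so another pass is needed
        return False

pi = Partition()
# reuse the blocks of pi'_f computed on S+ (only the merged blocks shown)
for block in (["q7", "q12"], ["q8", "q13"], ["q9", "q14"],
              ["q10", "q15"], ["q11", "q16"]):
    for q in block:
        pi.add(q)
        pi.merge(q, block[0])
# new states from S++ start as singleton blocks (the trivial partition pi'_0)
for q in ("q17", "q18", "q19", "q20", "q21"):
    pi.add(q)
# merge all blocks containing final states: q11, q16, q19, q21
for q in ("q16", "q19", "q21"):
    pi.merge("q11", q)

assert pi.find("q19") == pi.find("q16")   # one block [q11, q16, q19, q21]
assert pi.find("q17") != pi.find("q18")   # other new states stay singletons
```

Because merge reports whether anything changed, the do-while loop of Algorithm 1 can terminate exactly when a full pass performs no merge, which is what the test "while j ≠ i" checks.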
Next, mirc outputs the following reversible CFG:

S → NT_1 NT_2
NT_1 → NT_3 NT_4
NT_1 → NT_6 NT_7
NT_2 → NT_9 NT_1
NT_3 → the
NT_4 → NT_5
NT_5 → girl
NT_6 → a
NT_7 → NT_8
NT_8 → cat
NT_8 → dog
NT_9 → likes
NT_9 → chases.

Here, NT_1 = [q_4, q_9, q_14], NT_2 = [q_10, q_15, q_18, q_20], NT_3 = [q_1], NT_4 = [q_3], NT_5 = [q_2], NT_6 = [q_6], NT_7 = [q_8, q_13], NT_8 = [q_7, q_12], NT_9 = [q_5, q_17], and S = [q_11, q_16, q_19, q_21].

Suppose that at a further stage of the inference process the following examples are added to the sample:

σ ( σ ( σ ( a ), σ ( σ ( dog ) ) ), σ ( σ ( chases ), σ ( σ ( a ), σ ( σ ( girl ) ) ) ) )
σ ( σ ( σ ( the ), σ ( σ ( dog ) ) ), σ ( σ ( chases ), σ ( σ ( a ), σ ( σ ( young ), σ ( σ ( girl ) ) ) ) ) ).

For these examples the new states of the form q = σ(u_1, …, u_k) are:

q_22 = σ(q_6, q_3)
q_23 = σ(q_17, q_22)
q_24 = σ(q_14, q_23) (final state)
q_25 = σ(q_1, q_13)
q_26 = σ(young)
q_27 = σ(q_26, q_3)
q_28 = σ(q_6, q_27)
q_29 = σ(q_17, q_28)
q_30 = σ(q_25, q_29) (final state).

The automaton A = (Q, V, δ, F) is now:
V_0 = {the, girl, likes, a, cat, dog, chases, young}.
Q = {q_1, q_2, q_3, q_4, q_5, q_6, q_7, q_8, q_9, q_10, q_11, q_12, q_13, q_14, q_15, q_16, q_17, q_18, q_19, q_20, q_21, q_22, q_23, q_24, q_25, q_26, q_27, q_28, q_29, q_30}.
F = {q_11, q_16, q_19, q_21, q_24, q_30}.

Setting the initial partition from which merging proceeds:
π_1 = {[q_1], [q_2], [q_3], [q_4, q_9, q_14], [q_5, q_17], [q_6], [q_7, q_12], [q_8, q_13], [q_10, q_15, q_18, q_20], [q_11, q_16, q_19, q_21, q_24, q_30], [q_22], [q_23], [q_25], [q_26], [q_27], [q_28], [q_29]}.

Now mirt updates this partition by employing the merging process. Finally, it finds the following partition π_f and outputs the reversible tree automaton A/π_f:
π_f = {[q_1, q_6], [q_2, q_7, q_12], [q_3, q_8, q_13, q_27], [q_4, q_9, q_14, q_22, q_25, q_28], [q_5, q_17], [q_10, q_15, q_18, q_20, q_23, q_29], [q_11, q_16, q_19, q_21, q_24, q_30], [q_26]}.

Let us denote NT_1 = [q_4, q_9, q_14, q_22, q_25, q_28], NT_2 = [q_10, q_15, q_18, q_20, q_23, q_29], NT_3 = [q_1, q_6], NT_4 = [q_3, q_8, q_13, q_27], NT_5 = [q_2, q_7, q_12], NT_6 = [q_26], NT_7 = [q_5, q_17], and S = [q_11, q_16, q_19, q_21, q_24, q_30]. The algorithm mirc outputs the following reversible CFG:

S → NT_1 NT_2
NT_1 → NT_3 NT_4
NT_4 → NT_5
NT_4 → NT_6 NT_4
NT_2 → NT_7 NT_1
NT_3 → the
NT_3 → a
NT_5 → girl
NT_5 → cat
NT_5 → dog
NT_6 → young
NT_7 → likes
NT_7 → chases.

This grammar is isomorphic to the unknown grammar G_U.

4 Conclusion

We have presented efficient, polynomially bounded algorithms for updating a guess in the inference of CFGs from positive samples of their structural descriptions. These algorithms are of particular interest because it is impractical to restart the inference process from scratch every time new examples arrive in an online manner. Our scheme guarantees that no redundant computation is performed when a grammar is updated on receiving new examples. The computational saving has been illustrated with a concrete example.

References
[1] Gold, E.M., Language Identification in the Limit, Information and Control, vol. 10.
[2] Martin, J.C., Introduction to Languages and the Theory of Computation, Tata McGraw-Hill.
[3] Prajapati, G.L., Chaudhari, N.S., and Chandwani, M., An Effective Model for Context-Free Grammar Inference, in: Prasad, B. (Ed.), Proceedings of the 3rd Indian International Conference on Artificial Intelligence (IICAI-07), Pune, India.
[4] Sakakibara, Y., Efficient Learning of Context-Free Grammars from Positive Structural Examples, Information and Computation, vol. 97.
7 On rigid NL Lambek grammars inference from generalized functor-argument data Denis Béchet and Annie Foret Abstract This paper is concerned with the inference of categorial grammars, a context-free grammar
More information8: Hidden Markov Models
8: Hidden Markov Models Machine Learning and Real-world Data Helen Yannakoudakis 1 Computer Laboratory University of Cambridge Lent 2018 1 Based on slides created by Simone Teufel So far we ve looked at
More informationDecidable and undecidable languages
The Chinese University of Hong Kong Fall 2011 CSCI 3130: Formal languages and automata theory Decidable and undecidable languages Andrej Bogdanov http://www.cse.cuhk.edu.hk/~andrejb/csc3130 Problems about
More informationPeled, Vardi, & Yannakakis: Black Box Checking
Peled, Vardi, & Yannakakis: Black Box Checking Martin Leucker leucker@it.uu.se Department of Computer Systems,, Sweden Plan Preliminaries State identification and verification Conformance Testing Extended
More informationHarvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs
Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs Harry Lewis October 8, 2013 Reading: Sipser, pp. 119-128. Pushdown Automata (review) Pushdown Automata = Finite automaton
More informationPreference-Based Rank Elicitation using Statistical Models: The Case of Mallows
Preference-Based Rank Elicitation using Statistical Models: The Case of Mallows Robert Busa-Fekete 1 Eyke Huellermeier 2 Balzs Szrnyi 3 1 MTA-SZTE Research Group on Artificial Intelligence, Tisza Lajos
More informationWeak vs. Strong Finite Context and Kernel Properties
ISSN 1346-5597 NII Technical Report Weak vs. Strong Finite Context and Kernel Properties Makoto Kanazawa NII-2016-006E July 2016 Weak vs. Strong Finite Context and Kernel Properties Makoto Kanazawa National
More informationCFGs and PDAs are Equivalent. We provide algorithms to convert a CFG to a PDA and vice versa.
CFGs and PDAs are Equivalent We provide algorithms to convert a CFG to a PDA and vice versa. CFGs and PDAs are Equivalent We now prove that a language is generated by some CFG if and only if it is accepted
More informationPAC Generalization Bounds for Co-training
PAC Generalization Bounds for Co-training Sanjoy Dasgupta AT&T Labs Research dasgupta@research.att.com Michael L. Littman AT&T Labs Research mlittman@research.att.com David McAllester AT&T Labs Research
More informationHarvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata
Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata Salil Vadhan October 4, 2012 Reading: Sipser, 2.2. Another example of a CFG (with proof) L = {x {a, b} : x has the same # of a s and
More informationIC3 and Beyond: Incremental, Inductive Verification
IC3 and Beyond: Incremental, Inductive Verification Aaron R. Bradley ECEE, CU Boulder & Summit Middle School IC3 and Beyond: Incremental, Inductive Verification 1/62 Induction Foundation of verification
More informationComputability and Complexity
Computability and Complexity Lecture 5 Reductions Undecidable problems from language theory Linear bounded automata given by Jiri Srba Lecture 5 Computability and Complexity 1/14 Reduction Informal Definition
More informationPushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen
Pushdown Automata Notes on Automata and Theory of Computation Chia-Ping Chen Department of Computer Science and Engineering National Sun Yat-Sen University Kaohsiung, Taiwan ROC Pushdown Automata p. 1
More informationCollapsed Variational Bayesian Inference for Hidden Markov Models
Collapsed Variational Bayesian Inference for Hidden Markov Models Pengyu Wang, Phil Blunsom Department of Computer Science, University of Oxford International Conference on Artificial Intelligence and
More informationParsing. Unger s Parser. Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is
Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is Unger s Parser Laura Heinrich-Heine-Universität Düsseldorf Wintersemester 2012/2013 a top-down parser: we start with S and
More informationLogic. proof and truth syntacs and semantics. Peter Antal
Logic proof and truth syntacs and semantics Peter Antal antal@mit.bme.hu 10/9/2015 1 Knowledge-based agents Wumpus world Logic in general Syntacs transformational grammars Semantics Truth, meaning, models
More informationHandout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0
Massachusetts Institute of Technology 6.863J/9.611J, Natural Language Processing, Spring, 2001 Department of Electrical Engineering and Computer Science Department of Brain and Cognitive Sciences Handout
More informationMaschinelle Sprachverarbeitung
Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other
More informationAutomated Compositional Analysis for Checking Component Substitutability
Automated Compositional Analysis for Checking Component Substitutability Nishant Sinha December 2007 Electrical and Computer Engineering Department Carnegie Mellon University Pittsburgh, PA 15213 Thesis
More informationParsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)
Parsing Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) S N VP V NP D N John hit the ball Levels of analysis Level Morphology/Lexical POS (morpho-synactic), WSD Elements
More informationCSCI 1010 Models of Computa3on. Lecture 17 Parsing Context-Free Languages
CSCI 1010 Models of Computa3on Lecture 17 Parsing Context-Free Languages Overview BoCom-up parsing of CFLs. BoCom-up parsing via the CKY algorithm An O(n 3 ) algorithm John E. Savage CSCI 1010 Lect 17
More informationPush-down Automata = FA + Stack
Push-down Automata = FA + Stack PDA Definition A push-down automaton M is a tuple M = (Q,, Γ, δ, q0, F) where Q is a finite set of states is the input alphabet (of terminal symbols, terminals) Γ is the
More informationLecture 15. Probabilistic Models on Graph
Lecture 15. Probabilistic Models on Graph Prof. Alan Yuille Spring 2014 1 Introduction We discuss how to define probabilistic models that use richly structured probability distributions and describe how
More informationCSE 105 THEORY OF COMPUTATION
CSE 105 THEORY OF COMPUTATION Spring 2017 http://cseweb.ucsd.edu/classes/sp17/cse105-ab/ Review of CFG, CFL, ambiguity What is the language generated by the CFG below: G 1 = ({S,T 1,T 2 }, {0,1,2}, { S
More informationLinear Classifiers IV
Universität Potsdam Institut für Informatik Lehrstuhl Linear Classifiers IV Blaine Nelson, Tobias Scheffer Contents Classification Problem Bayesian Classifier Decision Linear Classifiers, MAP Models Logistic
More informationTuring Machines Part II
Turing Machines Part II Hello Hello Condensed Slide Slide Readers! Readers! This This lecture lecture is is almost almost entirely entirely animations that that show show how how each each Turing Turing
More informationPushdown Automata (Pre Lecture)
Pushdown Automata (Pre Lecture) Dr. Neil T. Dantam CSCI-561, Colorado School of Mines Fall 2017 Dantam (Mines CSCI-561) Pushdown Automata (Pre Lecture) Fall 2017 1 / 41 Outline Pushdown Automata Pushdown
More information10/17/04. Today s Main Points
Part-of-speech Tagging & Hidden Markov Model Intro Lecture #10 Introduction to Natural Language Processing CMPSCI 585, Fall 2004 University of Massachusetts Amherst Andrew McCallum Today s Main Points
More informationParsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21
Parsing Unger s Parser Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2016/17 1 / 21 Table of contents 1 Introduction 2 The Parser 3 An Example 4 Optimizations 5 Conclusion 2 / 21 Introduction
More informationEvidential Paradigm and Intelligent Mathematical Text Processing
Evidential Paradigm and Intelligent Mathematical Text Processing Alexander Lyaletski 1, Anatoly Doroshenko 2, Andrei Paskevich 1,3, and Konstantin Verchinine 3 1 Taras Shevchenko Kiev National University,
More informationLecture Notes in Machine Learning Chapter 4: Version space learning
Lecture Notes in Machine Learning Chapter 4: Version space learning Zdravko Markov February 17, 2004 Let us consider an example. We shall use an attribute-value language for both the examples and the hypotheses
More informationLearning Context Free Grammars with the Syntactic Concept Lattice
Learning Context Free Grammars with the Syntactic Concept Lattice Alexander Clark Department of Computer Science Royal Holloway, University of London alexc@cs.rhul.ac.uk ICGI, September 2010 Outline Introduction
More informationSOLUTION: SOLUTION: SOLUTION:
Convert R and S into nondeterministic finite automata N1 and N2. Given a string s, if we know the states N1 and N2 may reach when s[1...i] has been read, we are able to derive the states N1 and N2 may
More informationLecture Notes on Inductive Definitions
Lecture Notes on Inductive Definitions 15-312: Foundations of Programming Languages Frank Pfenning Lecture 2 September 2, 2004 These supplementary notes review the notion of an inductive definition and
More informationHoming and Synchronizing Sequences
Homing and Synchronizing Sequences Sven Sandberg Information Technology Department Uppsala University Sweden 1 Outline 1. Motivations 2. Definitions and Examples 3. Algorithms (a) Current State Uncertainty
More informationUnifying Version Space Representations: Part II
Unifying Version Space Representations: Part II E.N. Smirnov, I.G. Sprinkhuizen-Kuyper, and H.J. van den Herik IKAT, Department of Computer Science, Maastricht University, P.O.Box 616, 6200 MD Maastricht,
More informationContext-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing
Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Natural Language Processing CS 4120/6120 Spring 2017 Northeastern University David Smith with some slides from Jason Eisner & Andrew
More informationLecture 17: Language Recognition
Lecture 17: Language Recognition Finite State Automata Deterministic and Non-Deterministic Finite Automata Regular Expressions Push-Down Automata Turing Machines Modeling Computation When attempting to
More informationIntroduction to machine learning. Concept learning. Design of a learning system. Designing a learning system
Introduction to machine learning Concept learning Maria Simi, 2011/2012 Machine Learning, Tom Mitchell Mc Graw-Hill International Editions, 1997 (Cap 1, 2). Introduction to machine learning When appropriate
More informationTree Adjoining Grammars
Tree Adjoining Grammars TAG: Parsing and formal properties Laura Kallmeyer & Benjamin Burkhardt HHU Düsseldorf WS 2017/2018 1 / 36 Outline 1 Parsing as deduction 2 CYK for TAG 3 Closure properties of TALs
More informationUndecidable Problems and Reducibility
University of Georgia Fall 2014 Reducibility We show a problem decidable/undecidable by reducing it to another problem. One type of reduction: mapping reduction. Definition Let A, B be languages over Σ.
More information