(1) [John (i)] attacked [Bob (j)]. Police arrested him (i).
1,a) 1,b) 1,c) 1,d) 1,e)

... (e.g., "man arrest", "man arrest man") [Van de Cruys 2014].

1. ... [1], [2], [3] ... [4] [5] [6], [7], [8], [9] [10], [11] ...

(1) [John (i)] attacked [Bob (j)]. Police arrested him (i).

Fig. 1: John (i) / Bob (j); attack, arrest; "John" vs. "him".

(2) Police arrested [John].
(3) Police arrested [Bob].

1 Tohoku University
a) masayuki.ono@ecei.tohoku.ac.jp
b) naoya-i@ecei.tohoku.ac.jp
c) y-matsu@ecei.tohoku.ac.jp
d) okazaki@ecei.tohoku.ac.jp
e) inui@ecei.tohoku.ac.jp
... John ... Bob ... arrest ...

(4) Police arrested [John, who attacked Bob].
(5) Police arrested [Bob, whom John attacked].

Related work on selectional preference acquisition: Resnik [4] (WordNet [12] synsets), Rooth [6] (EM-based clustering), Latent Dirichlet Allocation (LDA) models [7], [8], Kawahara [9], [13] (Chinese Restaurant Process, CRP), Erk [10], [11], and Van de Cruys [5], who uses the word embeddings of Mikolov [14]. In the Van de Cruys model, a triple such as "police arrest criminal" is scored from the subj/obj embeddings s(police), v(arrest), o(criminal) of the SVO triple; see also Ritter [8] and Van de Cruys [5], [15].

For "the police arrested him" (Fig. 1), the question is whether "him" refers to John or to Bob.

Related lines of work: Narrative Chains [16]; inference-rule acquisition such as "X bankrupt" / "acquire X" [17], [18]; hard coreference problems [2], [3], [19]; Modi [20].
Fig. 2: [John (i)] attacked [Bob (j)]. Police arrested him (i). -- coreference (correct); [John (i)] attacked [Bob (j)]. Police arrested him (j). -- coreference (incorrect). Scoring s(police), v(arrest), o(criminal) for "police arrest criminal" vs. s(police), v(arrest), o(tomato) for "police arrest tomato" (w_o = criminal vs. w_o = tomato).

3. The Van de Cruys SVO model scores "police arrest [John]" from s(police), v(arrest), o(john) and "police arrest [Bob]" from s(police), v(arrest), o(bob); "police arrest [John, who attacked Bob]" and "police arrest [Bob, whom John attacked]" reduce to the same SVO triples.

For an SVO triple with subject w_s, verb w_v and object w_o, the embeddings s(w_s), v(w_v), o(w_o) ∈ R^d (*1) (e.g., for (police, arrest, criminal)) are combined into a score g[(w_s, w_v, w_o)] ∈ R by Eqs. (1), (2):

    g[(w_s, w_v, w_o)] = W_2 a(w_s, w_v, w_o)                               (1)
    a(w_s, w_v, w_o) = f(W_1 [s(w_s); v(w_v); o(w_o)] + b_1)                (2)

where [ ; ] denotes concatenation, W_1 ∈ R^{h x 3d}, W_2 ∈ R^{1 x h}, b_1 ∈ R^h, and f(.) is tanh. Training of g[(w_s, w_v, w_o)] follows Collobert [21], contrasting an observed triple with corrupted triples in which w_s and/or w_o are replaced; the parameters are the embeddings s, v, o together with W_1, W_2, b_1.

*1 Following Van de Cruys, each word w has separate embeddings s(w), v(w), o(w).

Fig. 3: "police arrest" (s(police), v(arrest)) combined with the context clause "John attack Bob" (s(john), v(attack), o(bob)) through W_subj / W_obj.

    max(0, 1 - g[(w_s, w_v, w_o)] + g[(\bar{w}_s, w_v, w_o)])
      + max(0, 1 - g[(w_s, w_v, w_o)] + g[(w_s, w_v, \bar{w}_o)])
      + max(0, 1 - g[(w_s, w_v, w_o)] + g[(\bar{w}_s, w_v, \bar{w}_o)])     (3)

where \bar{w}_s and \bar{w}_o are randomly sampled replacement words.

(6) [John (i)] attacked [Bob (j)]. Police arrested him (i).
(7) Police arrested [John, who attacked Bob].
(8) Police arrested [Bob, whom John attacked].

... "A of B" ... [2], [3], [19] ...
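To make Eqs. (1)-(3) concrete, the following is a minimal NumPy sketch of the triple-scoring network and the max-margin objective. The toy vocabulary, the dimensions d and h, and the names S, Vb, O, W1, W2, b1 are illustrative assumptions, not the paper's implementation.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and sizes (d: word embedding size, h: hidden size).
vocab = {w: i for i, w in enumerate(["police", "arrest", "criminal", "tomato", "john", "bob", "attack"])}
d, h, V = 8, 16, len(vocab)

# Separate subject/verb/object embedding tables s(w), v(w), o(w), one row per word.
S = rng.normal(scale=0.1, size=(V, d))
Vb = rng.normal(scale=0.1, size=(V, d))
O = rng.normal(scale=0.1, size=(V, d))
W1 = rng.normal(scale=0.1, size=(h, 3 * d))   # R^{h x 3d}
W2 = rng.normal(scale=0.1, size=(1, h))       # R^{1 x h}
b1 = np.zeros(h)

def g(ws, wv, wo):
    """Score of an SVO triple: g = W2 tanh(W1 [s(ws); v(wv); o(wo)] + b1), as in Eqs. (1)-(2)."""
    x = np.concatenate([S[vocab[ws]], Vb[vocab[wv]], O[vocab[wo]]])
    a = np.tanh(W1 @ x + b1)
    return float(W2 @ a)

def margin_loss(pos, neg_s, neg_o):
    """Max-margin loss of Eq. (3): the observed triple should outscore corrupted ones by a margin of 1."""
    ws, wv, wo = pos
    gp = g(ws, wv, wo)
    return (max(0.0, 1.0 - gp + g(neg_s, wv, wo)) +
            max(0.0, 1.0 - gp + g(ws, wv, neg_o)) +
            max(0.0, 1.0 - gp + g(neg_s, wv, neg_o)))

print(g("police", "arrest", "criminal"), g("police", "arrest", "tomato"))
print(margin_loss(("police", "arrest", "criminal"), "tomato", "tomato"))

Under this setup, a trained model would assign a higher score to (police, arrest, criminal) than to (police, arrest, tomato), which is what Fig. 2 illustrates.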
... (7) ... John ... (John, arrest, Bob) ... (8) ... Bob ... (John, arrest, Bob) ... (Fig. 3).

Building on the Van de Cruys [5] SVO model and the embeddings of [14], each argument word a is paired with its context predicate-argument structure c_a = <c_{a,s}, c_{a,v}, c_{a,o}>, the subject, verb and object of the clause in which a appears, with either c_{a,s} = a or c_{a,o} = a; c_{a,c} ∈ {subj, obj} records which role a fills.

(9) A man attacked a woman. (...) A police arrested him.

For "arrest" with a = man, the context c_a taken from "A man attacked a woman" is c_{a,s} = man, c_{a,v} = attack, c_{a,o} = woman, c_{a,c} = subj.

The SVO embeddings s(w_s) and o(w_o) are replaced by composed representations s_cmp(w_s, c_s) and o_cmp(w_o, c_o):

    g[(w_s, w_v, w_o, c_s, c_o)] = W_2 a(w_s, w_v, w_o, c_s, c_o)                          (4)
    a(w_s, w_v, w_o, c_s, c_o) = f(W_1 [s_cmp(w_s, c_s); v(w_v); o_cmp(w_o, c_o)] + b_1)   (5)

The role indicator c_{a,c} distinguishes, for "man", the contexts "a man attacks a woman" and "a woman attacks a man". s_cmp is defined as

    s_cmp(w_s, c_s) = f(W_subj [s(c_{s,s}); v(c_{s,v}); o(c_{s,o})])   if c_{s,c} = subj;
                      f(W_obj  [s(c_{s,s}); v(c_{s,v}); o(c_{s,o})])   if c_{s,c} = obj;
                      s(w_s)                                           otherwise;

where c_{s,s}, c_{s,v}, c_{s,o} are the elements of the context clause: W_subj is applied when the word is the subject ("a man attacks ...") and W_obj when it is the object ("... attack a man"), so that "a man attacked a woman" and "a woman attacked a man" yield different representations. o_cmp is defined analogously:

    o_cmp(w_o, c_o) = f(W_subj [s(c_{o,s}); v(c_{o,v}); o(c_{o,o})])   if c_{o,c} = subj;   (6)
                      f(W_obj  [s(c_{o,s}); v(c_{o,v}); o(c_{o,o})])   if c_{o,c} = obj;
                      o(w_o)                                           otherwise.

... "John attacked Bob" / "John who attacked Bob" ... Ji's Recursive Neural Network (RNN) based Entity Vector [22] ... Entity Vector ... (7)
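A small sketch of the composed subject representation s_cmp under the same kind of illustrative setup; the embedding tables, the context-tuple format (c_s, c_v, c_o, role), and the parameter names W_subj, W_obj are assumptions for illustration, and o_cmp would be built the same way with the fallback o(w_o).

import numpy as np

rng = np.random.default_rng(1)
d = 8
words = ["man", "woman", "attack", "policeman", "arrest", "boy"]
# Hypothetical per-role embedding tables s/v/o, mirroring the sketch above.
s_emb = {w: rng.normal(scale=0.1, size=d) for w in words}
v_emb = {w: rng.normal(scale=0.1, size=d) for w in words}
o_emb = {w: rng.normal(scale=0.1, size=d) for w in words}
W_subj = rng.normal(scale=0.1, size=(d, 3 * d))  # used when the word is the subject of its context clause
W_obj = rng.normal(scale=0.1, size=(d, 3 * d))   # used when the word is the object of its context clause

def s_cmp(w_s, c):
    """Composed subject representation from the context clause c = (c_s, c_v, c_o, role).

    role is "subj" or "obj"; with no context clause (c is None), fall back to the plain embedding s(w_s).
    """
    if c is None:
        return s_emb[w_s]
    c_s, c_v, c_o, role = c
    x = np.concatenate([s_emb[c_s], v_emb[c_v], o_emb[c_o]])
    W = W_subj if role == "subj" else W_obj
    return np.tanh(W @ x)

# Example (9): the antecedent "man" is represented through its role in <man, attack, woman, subj>.
vec_man_as_attacker = s_cmp("man", ("man", "attack", "woman", "subj"))
vec_man_plain = s_cmp("man", None)
print(vec_man_as_attacker.shape, np.allclose(vec_man_as_attacker, vec_man_plain))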
4.1 Training. The parameters are the embeddings s, v, o, the bias b_1, and W_subj, W_obj ... tanh (range (-1, 1)) ... s, v, o ... tanh.

Three types of training instances are used:
    Type A:  <w_s, w_v, w_o>
    Type B1: <w_s, w_v, w_o, ...>
    Type B2: <w_s, w_v, w_o, ...>

(10) [The old man (i)] attacked [a boy (j)]. A policeman arrested the man (i).

For "the man (i)" (= "The old man (i)"), Type A yields <policeman, arrest, man> and Type B2 yields <policeman, arrest, man, <man, attack, boy, subj>>.

The training data are extracted from the ClueWeb12 Web corpus (*2) parsed with Stanford CoreNLP [23] (*3), giving Type A, Type B1 and Type B2 instances ... (1) ... (2) ... ,177,864 ... 1,282,994 ... 1,140,266 ...

Negative examples: (1) for (10), Type A <policeman, arrest, man> is corrupted to, e.g., <policeman, arrest, book>, and Type B2 <policeman, arrest, man, <man, attack, boy, subj>> to, e.g., <policeman, arrest, banana, <monkey, eat, banana, obj>>. (2) For Type B1 and Type B2, the role indicator is also flipped: <policeman, arrest, man, <man, attack, boy, subj>> vs. <policeman, arrest, man, <man, attack, boy, obj>>.

The objective contrasts an observed instance with its corruptions:

    - log σ(g[(w_s, w_v, w_o)])
      - log(1 - σ(g[(\bar{w}_s, w_v, w_o)]))
      - log(1 - σ(g[(w_s, w_v, \bar{w}_o)]))
      - log(1 - σ(g[(\bar{w}_s, w_v, \bar{w}_o)]))                          (8)

where σ is the logistic sigmoid and \bar{w}_s, \bar{w}_o are corrupted words.

5. ... [2], [3], [19] ...
5.1 ... (8) ...
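Before turning to the experiments in Section 5, here is a sketch of how pseudo-negative triples and the objective of Eq. (8) could be computed; corrupt and nll_with_negatives are hypothetical helper names, and the scores passed in would come from g[.] as sketched above.

import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nll_with_negatives(g_pos, g_neg_s, g_neg_o, g_neg_so):
    """Negative log-likelihood of Eq. (8): the observed triple is a positive example;
    triples with a corrupted subject, object, or both serve as negatives."""
    return -(np.log(sigmoid(g_pos))
             + np.log(1.0 - sigmoid(g_neg_s))
             + np.log(1.0 - sigmoid(g_neg_o))
             + np.log(1.0 - sigmoid(g_neg_so)))

def corrupt(triple, vocab):
    """Draw pseudo-negative triples by replacing the subject and/or object with random words,
    e.g. <policeman, arrest, man> -> <policeman, arrest, book>."""
    ws, wv, wo = triple
    bar_s, bar_o = rng.choice(vocab), rng.choice(vocab)
    return (bar_s, wv, wo), (ws, wv, bar_o), (bar_s, wv, bar_o)

vocab = np.array(["book", "banana", "monkey", "boy", "tomato"])
print(corrupt(("policeman", "arrest", "man"), vocab))
print(nll_with_negatives(2.0, -1.0, -0.5, -2.0))  # the four scores would come from g[.]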
Evaluation uses Mean Reciprocal Rank (MRR):

    MRR = (1 / |Q|) Σ_{i=1}^{|Q|} 1 / rank(i)                               (9)

where Q is the set of queries and rank(i) is the rank of the correct antecedent for query i. A pseudo-disambiguation test is also used (cf. (4)). Coreference metrics such as CEAF and BLANC [24], [25], etc. ... OntoNotes 4.0 [26] ... OntoNotes ... Web ...

(11) In his (p) 40-minute speech (i), Chen (j) declared the determination (k) of the people (l) of Taiwan (m) to continue forward... (...)... Chen (n) visited... (...), and he (p') stated... (ectb 1025)

For the pronouns he (p') and his (p), the candidate antecedents are the mentions i, j, k, l, m ... (1) ... (2) ... (3) ... MRR ... SVO model vs. Entity Vector (*4) ... For Chen (j) vs. Chen (n), the MRR rank ... for Chen (j), the context is c_j = <Chen, declared, determination, subj>; with no context, c_j = φ. OntoNotes ... d = ..., AdaGrad [27], 1,000 ..., 20 ...

Comparison with the Van de Cruys [5] SVO baseline ... (1) ... (2) ... (3) ... Entity Vector ... MRR ... SVO ...
*4 Wilcoxon ...
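The MRR of Eq. (9) can be computed as in the sketch below; the query format (a list of candidate scores plus the index of the gold antecedent) is an assumption for illustration.

def mean_reciprocal_rank(queries):
    """MRR of Eq. (9): average over queries of 1 / rank of the correct antecedent.

    Each query is (scores, gold_index), where scores[i] is the model score g[.] for
    candidate antecedent i; higher scores are ranked first.
    """
    total = 0.0
    for scores, gold in queries:
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        rank = order.index(gold) + 1
        total += 1.0 / rank
    return total / len(queries)

# Toy example: two pronouns, each with three candidate antecedents; the gold antecedent is candidate 0.
print(mean_reciprocal_rank([([0.9, 0.2, 0.4], 0), ([0.5, 0.8, 0.3], 0)]))  # (1/1 + 1/2) / 2 = 0.75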
Fig. 4 (pseudo-disambiguation): SVO ... axis: 25.00%, 50.00%, 75.00%, 100% ... e.g., "people eat book" vs. "people eat food" ... MRR ... (*5)

SVO instances: 26,500,536 ... 7,323,250 ... MRR (0.3866) ... MRR of the baseline (SVO) ... "in restaurant" ... incorrectly ...

(12) All these were the highest levels in history. Since being put into operation five years ago, the Tianjin Port Bonded Area (i) has completed the construction of China's first goods distribution center, functions like a customs port, opened up the special use the railway line from... Moreover, the bonded area (j) has implemented a series of preferential policies towards enterprises entering the area (k): it (pro) has established a system of no custom accounting and established manuals and management... (chtb 0099)

The candidate antecedents of it (pro) are the Tianjin Port Bonded Area (i), the bonded area (j), and the area (k) ... SVO ... for <it (pro), establish, ...>, the composed representation "X (X which completed construction)" for i and "X (X which implemented series)" for j ... establish ...
6. ... recurrent ...

Acknowledgments: JSPS 15H01702, 15H05318, 15K16045; JST CREST.

[1] Bergsma, S.: Discriminative Learning of Selectional Preference from Unlabeled Text, EMNLP (2008).
[2] Rahman, A. and Ng, V.: Resolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge, EMNLP-CoNLL (2012).
[3] Peng, H., Khashabi, D. and Roth, D.: Solving Hard Coreference Problems, NAACL (2015).
[4] Resnik, P.: Selectional constraints: An information-theoretic model and its computational realization, Cognition, Vol. 61, No. 1 (1996).
[5] Van de Cruys, T.: A neural network approach to selectional preference acquisition, EMNLP (2014).
[6] Rooth, M., Riezler, S., Prescher, D., Carroll, G. and Beil, F.: Inducing a Semantically Annotated Lexicon via EM-Based Clustering, ACL (1999).
[7] Ó Séaghdha, D.: Latent variable models of selectional preference, ACL (2010).
[8] Ritter, A., Etzioni, O. et al.: A latent dirichlet allocation method for selectional preferences, ACL (2010).
[9] Kawahara, D., Peterson, D. W. and Palmer, M.: A Step-wise Usage-based Method for Inducing Polysemy-aware Verb Classes, ACL (2014).
[10] Erk, K.: A simple, similarity-based model for selectional preferences, ACL, Vol. 45, No. 1, p. 216 (2007).
[11] Erk, K., Padó, S. and Padó, U.: A flexible, corpus-driven model of regular and inverse selectional preferences, Computational Linguistics, Vol. 36, No. 4 (2010).
[12] Fellbaum, C.: WordNet: An Electronic Lexical Database, Bradford Books (1998).
[13] Kawahara, D. and Kurohashi, S.: Case frame compilation from the web using high-performance computing, NAACL, pp. 1-7 (2006).
[14] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S. and Dean, J.: Distributed representations of words and phrases and their compositionality, NIPS (2013).
[15] Van de Cruys, T., Poibeau, T. and Korhonen, A.: A tensor-based factorization model of semantic compositionality, HLT-NAACL (2013).
[16] Chambers, N. and Jurafsky, D.: Unsupervised Learning of Narrative Schemas and their Participants, ACL (2009).
[17] Lin, D. and Pantel, P.: DIRT: discovery of inference rules from text, KDD '01 (2001).
[18] J. Berant, T. A. and Goldberger, J.: Global Learning of Typed Entailment Rules, ACL (2008).
[19] Inoue, N., Ovchinnikova, E., Inui, K. and Hobbs, J.: Coreference Resolution with ILP-based Weighted Abduction, COLING (2012).
[20] Modi, A. and Titov, I.: Inducing neural models of script knowledge, CoNLL, p. 49 (2014).
[21] Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K. and Kuksa, P.: Natural language processing (almost) from scratch, The Journal of Machine Learning Research, Vol. 12 (2011).
[22] Ji, Y. and Eisenstein, J.: One Vector is Not Enough: Entity-Augmented Distributed Semantics for Discourse Relations (2015).
[23] Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. J. and McClosky, D.: The Stanford CoreNLP Natural Language Processing Toolkit, ACL System Demonstrations (2014).
[24] Luo, X.: On Coreference Resolution Performance Metrics, HLT/EMNLP (2005).
[25] Recasens, M. and Hovy, E. H.: BLANC: Implementing the Rand Index for Coreference Evaluation, Journal of Natural Language Engineering (2010).
[26] Hovy, E., Marcus, M., Palmer, M., Ramshaw, L. and Weischedel, R.: OntoNotes: the 90% solution, NAACL Companion Volume: Short Papers (2006).
[27] Duchi, J., Hazan, E. and Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization, The Journal of Machine Learning Research, Vol. 12 (2011).