The Research on Syntactic Features in Semantic Role Labeling

Journal of Chinese Information Processing, Vol. 23, No. 6, Nov. 2009
Article ID: 1003-0077(2009)06-0011-08
CLC number: TP391    Document code: A

LI Junhui, WANG Hongling, ZHOU Guodong, ZHU Qiaoming, QIAN Peide
(School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215006, China)

Abstract: A feature-based semantic role labeling (SRL) system operating on a single syntactic parse is constructed. The system is divided into three sequential tasks: (1) filtering out constituents that, with high probability, represent no semantic argument; (2) classifying the remaining candidate constituents into specific categories (including the NULL class); and (3) handling, in a post-processing step, overlapping arguments and constituents that are all labeled as core arguments. Besides combining and optimizing existing features from other work, the paper extracts new features based on knowledge of grammar, patterns and collocations. Experiments show the effectiveness and robustness of the new features, with which the final SRL system achieves F1 values of 77.54% and 78.75% on the development set and the WSJ test set respectively. As far as we know, this is the best result based on a single syntactic parser on the CoNLL-2005 Shared Task.

Key words: artificial intelligence; natural language processing; semantic role labeling; grammar-driven feature; pattern feature; collocation feature

Received: 2008-11-11; revised: 2009-01-15
Funding: 863 Program (2006AA01Z147); grants 60673041 and 08KJD520010
Author notes (names not recoverable): (1983-), (1975-), (1967-)

1 Introduction

Semantic Role Labeling (SRL) assigns semantic roles to the arguments of a predicate. Typical roles include Agent, Patient, Instrument and Locative. The CoNLL 2004 [1] and 2005 [2] shared tasks were both devoted to SRL.

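Further roles include Temporal, Manner and Cause. PropBank labels core arguments as Arg0, Arg1 and so on, and adjuncts with labels such as ArgM-LOC. Figure 1 shows the example sentence "YaoMing plays basketball in NBA" with the predicate plays.

SRL is commonly divided into two subtasks, Argument Identification and Argument Classification. On the CoNLL-2005 data, the system described here achieves F1 values of 77.54% and 78.75% on the development set and the WSJ test set respectively. The rest of the paper covers related work, the Baseline system, the new features, the experiments and the conclusions.

2 Related Work

Gildea and Jurafsky [3] proposed seven features: Constituent Type, Subcategorization, Parse Tree Path, Constituent Position, Predicate Voice, Constituent Head Word and Predicate. These seven features have become the basic features of SRL. Pradhan et al. [4] extended the seven with further features such as Head POS and, for PP constituents, First Word in Constituent. Xue and Palmer [5] also built on the seven features.

Moschitti [6] proposed the PAK (Predicate/Argument Structure Kernel). Che et al. [7] decomposed PAK into a Path Kernel and a Constituent Structure Kernel. Zhang et al. [8] proposed a grammar-driven kernel that allows approximate matching, for example between "buy a car" and "buy a red car", or between "high degree" and "higher degree".

3 Baseline System

3.1 System overview

The system processes each predicate in three steps.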

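1) Pruning. A binary classifier (NULL vs. NON-NULL) is trained, and every constituent whose probability of being NULL exceeds a threshold δ (δ = 0.9) is filtered out.
2) Classification. The remaining candidate constituents are classified into specific argument categories, with NULL as one of the classes.
3) Post-processing. Conflicts among the labeled constituents are resolved.

Constituents with P(NULL) > δ are removed in step 1; step 2 uses SVMLight with the one-vs-others strategy and still includes the NULL class.

Note: The OpenNLP Maximum Entropy Package, http://maxent.sourceforge.net/

3.2 Features

Table 1  Baseline features for argument identification (IBaseFeature), drawn from [4, 12-14] and [4-5, 12-14]
Table 2  Baseline features for argument classification (CBaseFeature), which include the seven basic features and further features drawn from [4, 12-14] and [4, 13-14]

The path feature records the route through the parse tree from a constituent to the predicate, for example NP↑S↓VP↓VB. In Figure 1, the path from NP (Yaoming) to the predicate play is NP↑S↓VP↓VBD. The subcategorization feature records the expansion of the VP that dominates the predicate, for example VP->VBZ_NP_PP. Combined features attach a word to the constituent type, so that the PP in Figure 1 yields values such as PP+in and PP+NBA.

3.3 Post-processing

Two constituents C1 and C2 may overlap, as NP (NBA) and PP (in NBA) do in Figure 1. In that case only one of them is kept as an argument and the other is labeled NULL.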
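The path feature of Section 3.2 can be illustrated with a small sketch. The `Node` class, the `tree_path` function and the hand-built tree are hypothetical names introduced here for illustration and are not taken from the paper; the tree is an assumed bracketing of the Figure 1 sentence, with VBZ as the predicate tag. The path climbs from the constituent to the lowest common ancestor and then descends to the predicate.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A parse-tree node; `label` is a phrase label or POS tag (hypothetical helper)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self):
        # Link each child back to this node so paths can be traced upward.
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Return the chain [node, parent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def tree_path(constituent, predicate):
    """Labels from `constituent` up to the lowest common ancestor, then down to `predicate`."""
    up = ancestors(constituent)
    down = ancestors(predicate)
    up_ids = {id(n) for n in up}
    # First ancestor of the predicate that also dominates the constituent.
    lca_index = next(i for i, n in enumerate(down) if id(n) in up_ids)
    lca = down[lca_index]
    up_labels = []
    for n in up:
        up_labels.append(n.label)
        if n is lca:
            break
    down_labels = [n.label for n in reversed(down[:lca_index])]
    return "↑".join(up_labels) + "".join("↓" + lab for lab in down_labels)


# Assumed bracketing: (S (NP YaoMing) (VP (VBZ plays) (NP basketball) (PP (IN in) (NP NBA))))
np_subj, vbz, np_obj, np_loc = Node("NP"), Node("VBZ"), Node("NP"), Node("NP")
pp = Node("PP", [Node("IN"), np_loc])
vp = Node("VP", [vbz, np_obj, pp])
s = Node("S", [np_subj, vp])

print(tree_path(np_subj, vbz))  # NP↑S↓VP↓VBZ
print(tree_path(np_loc, vbz))   # NP↑PP↑VP↓VBZ
```

Identity comparison (`is`, `id`) is used instead of equality because several nodes in a tree share the same label.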

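For two constituents C1 and C2 that receive the same core-argument label A, let B be the second most probable label of C1 and C the second most probable label of C2. If p(A|C1) p(C|C2) > p(B|C1) p(A|C2), C1 is labeled A and C2 is labeled C; otherwise C1 is labeled B and C2 is labeled A.

Figure 1  Parse tree of "YaoMing played basketball in NBA" (predicate: play)

4 New Features

This section extends the SRL Baseline with new features.

4.1 Grammar-driven features

In Figure 3(a), NP (Big investment banks) is an argument of the predicate buying, but the verbs refused, step and support and the preposition by lie between them, so the path from the NP to the predicate is long: NP↑S↓VP↓S↓VP↓VP↓S↓VP↓VP↓PP↓S↓VP↓VBG. Grammar rules are used to shorten such paths. Verbs are recognized by the POS tags VB, VBZ, VBD, VBP, VBG and VBN. The rules cover:

a. VBN;
b. SBAR clauses, such as SBAR (that) and SBAR (since);
c. coordination A -> A CC A, where CC is a conjunction such as and or or;
d. S nodes (Figure 2(a) to 2(c)), with expansions such as S -> NP + VP and S -> (TO +) VP.

Figure 2  S structures, panels (a) to (c)

In the example, skipping refused, step and support lets VP (buying ...) take the place of VP (refused to ...) (Figure 3(b)), which brings NP (Big investment banks) close to buy. The second case concerns constituents under the predicate's VP, such as NP (basketball) in Figure 1.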
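The rule from Section 3.3 for two constituents that share a core-argument label can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `resolve_duplicate`, the dictionary representation of the classifier output and the example probabilities are all assumptions made here.

```python
def resolve_duplicate(p1, p2):
    """Resolve a duplicated core label between constituents C1 and C2.

    p1, p2: dicts mapping label -> probability for C1 and C2 (assumed format).
    Returns (label for C1, label for C2).
    """
    def top_two(probs):
        ranked = sorted(probs, key=probs.get, reverse=True)
        return ranked[0], ranked[1]

    a1, b = top_two(p1)  # C1: best label and second-best label B
    a2, c = top_two(p2)  # C2: best label and second-best label C
    if a1 != a2:
        return a1, a2    # no duplicated label, nothing to resolve
    a = a1
    # Keep A on the constituent for which the joint assignment is more probable.
    if p1[a] * p2[c] > p1[b] * p2[a]:
        return a, c
    return b, a


# Hypothetical probabilities for two constituents that both prefer A0.
c1 = {"A0": 0.7, "A1": 0.2, "NULL": 0.1}
c2 = {"A0": 0.5, "A1": 0.4, "NULL": 0.1}
print(resolve_duplicate(c1, c2))  # ('A0', 'A1')
```

With these numbers, 0.7 × 0.4 = 0.28 exceeds 0.2 × 0.5 = 0.10, so C1 keeps A0 and C2 falls back to its second choice. Swapping the arguments gives the mirrored result.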

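Figure 3  Relation between NP (Big investment banks) and VBG (buying), panels (a) and (b)

4.2 Pattern features

A verb pattern (Pattern) describes the frames in which a verb appears, written with placeholders such as Sbody and Sthing, as in the patterns listed for benefit. Patterns help to separate roles such as A0 and A1. The verb open, for example, has the patterns "NP1 open" and "NP2 open NP3": "the store finally opened" follows the first, and "I opened the box" follows the second, so the role of the NP in front of open depends on which pattern applies.

Figure 4  Pattern feature examples, panels (a) and (b)

In Figure 4(a), the pattern feature for NP (I) takes the value NP V (A) NP, where V marks the predicate. In Figure 4(b), the pattern feature for NP (The event) takes the value NP V (A) place.

Collocation features

Verbs such as go, come and take change meaning inside collocations (Collocation) such as go through, come up with, take place and take over, so the collocation is used as a feature.

5 Experiments

5.1 Data

The experiments use the CoNLL2005 data.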

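The data come from PropBank and Brown. PropBank [9] Sections 02-21 form the training set (39,832 sentences), Section 24 the development set (1,346 sentences) and Section 23 the WSJ test set (2,416 sentences); the Brown test set has 426 sentences. CoNLL2005 supplies parses from the Charniak and Collins parsers [10-11], for which the figures 10.08% and 11.89% are reported [14]; this paper uses the Charniak parses. Evaluation uses srl-eval.pl, with F1 as the main measure. SVMLight is run with c = 0.1316, e = 0.01 and m = 100000.

5.2 Argument identification

Constituents with P(NULL) > δ (δ = 0.9) are pruned. Table 3 gives the results.

Table 3  Argument identification on CoNLL 2005 Shared Task Test WSJ (δ = 0.9)

Features                  Precision  Recall  F1
IBaseFeature              77.75      86.02   81.68
IBaseFeature + [...]      76.66      92.75   83.94
IBaseFeature + [...]      74.14      92.20   82.19
IBaseFeature + Both       76.27      92.75   83.76

As Table 3 shows, the new features raise Recall from 86.02 to above 92, while Precision drops slightly (76.66% and 76.27% for the second and fourth rows).

5.3 SRL results

Constituents with P(NULL) > δ are pruned first, and the rest are classified, with NULL among the classes. Table 4 gives the results on CoNLL 2005 Shared Task Test WSJ when new features are added to the classification step.

Table 4  SRL on CoNLL 2005 Shared Task Test WSJ (δ = 0.9, identification with IBaseFeature)

Features                  Precision  Recall  F1
CBaseFeature              80.67      75.21   77.85
CBaseFeature + [...]      80.99      75.68   78.24
CBaseFeature + [...]      80.64      75.53   78.00
CBaseFeature + [...]      80.86      75.46   78.07
CBaseFeature + [...]      80.68      75.36   77.93
CBaseFeature + [...]      80.98      75.50   78.14

Table 5 reports results on CoNLL 2005 Shared Task Test WSJ when new features are added to both steps, compared with the Baseline. The first observation concerns the features tied to S and SBAR nodes: they raise argument identification F1 by about 2.3%, but the corresponding gain in overall SRL F1 is much smaller (0.23%).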
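The F1 column in these tables is the harmonic mean of Precision and Recall. The short sketch below recomputes it for the first three rows of Table 3; the function name `f1_score` is introduced here for illustration. The official scorer presumably works from raw counts, so recomputing from rounded percentages can differ in the last digit for some rows.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (features, Precision, Recall) taken from the first three rows of Table 3.
table3_rows = [
    ("IBaseFeature", 77.75, 86.02),
    ("IBaseFeature + [...] (row 2)", 76.66, 92.75),
    ("IBaseFeature + [...] (row 3)", 74.14, 92.20),
]

for name, p, r in table3_rows:
    print(f"{name}: F1 = {f1_score(p, r):.2f}")
```

Because the harmonic mean is pulled toward the smaller value, the large Recall gain in rows 2 and 3 outweighs the small Precision loss.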

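Table 5  SRL on CoNLL 2005 Shared Task Test WSJ with new features in both steps (δ = 0.9)

Identification features   Classification features   Precision  Recall  F1
IBaseFeature              CBaseFeature              80.67      75.21   77.85
IBaseFeature + [...]      CBaseFeature              80.56      75.88   78.15
IBaseFeature + [...]      CBaseFeature + [...]      80.83      76.07   78.38
IBaseFeature + [...]      CBaseFeature + [...]      80.96      76.32   78.57
IBaseFeature + [...]      CBaseFeature + Both       81.17      76.47   78.75

Two further observations follow from Table 5. Adding new features to the classification step raises F1 by a further 0.42% (78.15 to 78.57). With all new features combined, the SRL F1 rises from the Baseline's 77.85% to 78.75%.

5.4 Comparison with other SRL systems

Punyakanok et al. [12] took part in the CoNLL 2005 Shared Task; their result with a single parser (Charniak) is used here. Surdeanu and Turmo [13] and [14] are compared in the same way, as is the work of Pradhan, all on the Charniak parse. Table 6 lists these systems together with Baseline and Baseline + Both.

Table 6 shows that:
1) The Baseline, which combines features from [4-5, 14], already outperforms the systems of [13-14].
2) The new features improve the SRL F1 by 1.22 on the development set and by 0.94 on the full test set (WSJ + Brown).
3) SRL performance on Brown is well below that on WSJ: the F1 is 11.13% lower.
4) On the Brown test set, the F1 of Baseline + Both (67.62) is close to that of the CoNLL 2005 Shared Task system of Punyakanok et al. (67.75).

Table 6  Comparison of SRL systems

System                    Development          Test WSJ             Test Brown           Test WSJ + Brown
                          P      R      F1     P      R      F1     P      R      F1     P      R      F1
Punyakanok et al., 2005   80.05  74.83  77.35  82.28  76.78  79.44  73.38  62.93  67.75  81.18  74.92  77.92
Surdeanu et al., 2005     79.14  71.57  75.17  80.32  72.95  76.46  72.41  59.67  65.42  79.35  71.17  75.04
[14], 2007                79.65  71.34  75.49  81.30  73.37  77.13  71.65  60.36  65.52  80.02  71.65  75.60
Baseline                  79.28  73.58  76.32  80.67  75.21  77.85  70.44  62.93  66.47  79.35  73.57  76.35
Baseline + Both           80.13  75.11  77.54  81.17  76.47  78.75  71.40  64.22  67.62  79.91  74.83  77.29

6 Conclusion

This paper builds a feature-based SRL system in three steps: pruning, classification, and post-processing of Overlap and duplicated core arguments. New features are added to the Baseline, and the system is evaluated on the CoNLL 2005 Shared Task data.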

Errors remain. In "The store opened last week", for example, the patterns "NP open" and "NP open NP" are not told apart, and The store is labeled A0.

Two problems are left for future work:
1) Data sparseness. The CoNLL 2005 Shared Task training data contain 90,750 predicate instances over 3,101 distinct predicates, and the accompanying figures (10, 20%; 1,795, 26%) describe how unevenly they are distributed.
2) The treatment of related words, for example anesthetic and antibiotic.

References

[1] Carreras X. and Màrquez L. Introduction to the CoNLL-2004 Shared Task: Semantic Role Labeling [C]//Proceedings of CoNLL 2004 Shared Task. 2004.
[2] Carreras X. and Màrquez L. Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling [C]//Proceedings of CoNLL 2005 Shared Task. 2005.
[3] Gildea D. and Jurafsky D. Automatic Labeling of Semantic Roles [J]. Computational Linguistics, 2002, 28(3): 245-288.
[4] Pradhan S., Hacioglu K., Krugler V. et al. Support Vector Learning for Semantic Argument Classification [J]. Machine Learning Journal, 2005, 60(3): 11-39.
[5] Xue N. and Palmer M. Calibrating Features for Semantic Role Labeling [C]//Proceedings of EMNLP, 2004.
[6] Moschitti A. A Study on Convolution Kernels for Shallow Semantic Parsing [C]//Proceedings of ACL-2004, 2004: 335-342.
[7] Che W., Zhang M., Liu T. and Li S. A Hybrid Convolution Tree Kernel for Semantic Role Labeling [C]//Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, 2006: 73-80.
[8] Zhang M., Che W., Aw A. T. et al. A Grammar-driven Convolution Tree Kernel for Semantic Role Classification [C]//Proceedings of ACL-2007, 2007: 200-207.
[9] Palmer M., Gildea D. and Kingsbury P. The Proposition Bank: An Annotated Corpus of Semantic Roles [J]. Computational Linguistics, 2005, 31(1).
[10] Charniak E. A Maximum-entropy Inspired Parser [C]//Proceedings of NAACL-2000, 2000.
[11] Collins M. Head-driven Statistical Models for Natural Language Parsing [D]. Ph.D. thesis, University of Pennsylvania, 1999.
[12] Punyakanok V., Koomen P., Roth D. and Yih W. Generalized Inference with Multiple Semantic Role Labeling Systems [C]//Proceedings of CoNLL-2005, 2005.
[13] Surdeanu M. and Turmo J. Semantic Role Labeling Using Complete Syntactic Analysis [R]//Proceedings of CoNLL-2005, 2005.
[14] ... [J]. ..., 2007, 18(3): 565-573.