(1) [John (i)] attacked [Bob (j)]. Police arrested him (i).
1,a) 1,b) 1,c) 1,d) 1,e)

... (e.g., "man arrest", "man arrest man") [Van de Cruys 2014].

1. ... [1], [2], [3] ... [4] [5] [6], [7], [8], [9] [10], [11] ...

(1) [John (i)] attacked [Bob (j)]. Police arrested him (i).

Fig. 1: John (i) / Bob (j); attack, arrest; "John" vs. "him".

(2) Police arrested [John].
(3) Police arrested [Bob].

1 Tohoku University
a) masayuki.ono@ecei.tohoku.ac.jp
b) naoya-i@ecei.tohoku.ac.jp
c) y-matsu@ecei.tohoku.ac.jp
d) okazaki@ecei.tohoku.ac.jp
e) inui@ecei.tohoku.ac.jp
... John ... Bob ... arrest ...

(4) Police arrested [John, who attacked Bob].
(5) Police arrested [Bob, whom John attacked].

Related work on selectional preference acquisition: Resnik [4] (WordNet [12] synsets), Rooth [6] (EM-based clustering), Latent Dirichlet Allocation (LDA) models [7], [8], Kawahara [9], [13] (Chinese Restaurant Process, CRP), Erk [10], [11], and Van de Cruys [5], who uses the word embeddings of Mikolov [14]. In the Van de Cruys model, a triple such as "police arrest criminal" is scored from the subj/obj embeddings s(police), v(arrest), o(criminal) of the SVO triple; see also Ritter [8] and Van de Cruys [5], [15].

For "the police arrested him" (Fig. 1), the question is whether "him" refers to John or to Bob.

Related lines of work: Narrative Chains [16]; inference-rule acquisition such as "X bankrupt" / "acquire X" [17], [18]; hard coreference problems [2], [3], [19]; Modi [20].
Fig. 2: [John (i)] attacked [Bob (j)]. Police arrested him (i). -- coreference (correct); [John (i)] attacked [Bob (j)]. Police arrested him (j). -- coreference (incorrect). Scoring s(police), v(arrest), o(criminal) for "police arrest criminal" vs. s(police), v(arrest), o(tomato) for "police arrest tomato" (w_o = criminal vs. w_o = tomato).

3. The Van de Cruys SVO model scores "police arrest [John]" from s(police), v(arrest), o(john) and "police arrest [Bob]" from s(police), v(arrest), o(bob); "police arrest [John, who attacked Bob]" and "police arrest [Bob, whom John attacked]" reduce to the same SVO triples.

For an SVO triple with subject w_s, verb w_v and object w_o, the embeddings s(w_s), v(w_v), o(w_o) ∈ R^d (*1) (e.g., for (police, arrest, criminal)) are combined into a score g[(w_s, w_v, w_o)] ∈ R by Eqs. (1), (2):

    g[(w_s, w_v, w_o)] = W_2 a(w_s, w_v, w_o)                               (1)
    a(w_s, w_v, w_o) = f(W_1 [s(w_s); v(w_v); o(w_o)] + b_1)                (2)

where [ ; ] denotes concatenation, W_1 ∈ R^{h x 3d}, W_2 ∈ R^{1 x h}, b_1 ∈ R^h, and f(.) is tanh. Training of g[(w_s, w_v, w_o)] follows Collobert [21], contrasting an observed triple with corrupted triples in which w_s and/or w_o are replaced; the parameters are the embeddings s, v, o together with W_1, W_2, b_1.

*1 Following Van de Cruys, each word w has separate embeddings s(w), v(w), o(w).

Fig. 3: "police arrest" (s(police), v(arrest)) combined with the context clause "John attack Bob" (s(john), v(attack), o(bob)) through W_subj / W_obj.

    max(0, 1 - g[(w_s, w_v, w_o)] + g[(\bar{w}_s, w_v, w_o)])
      + max(0, 1 - g[(w_s, w_v, w_o)] + g[(w_s, w_v, \bar{w}_o)])
      + max(0, 1 - g[(w_s, w_v, w_o)] + g[(\bar{w}_s, w_v, \bar{w}_o)])     (3)

where \bar{w}_s and \bar{w}_o are randomly sampled replacement words.

(6) [John (i)] attacked [Bob (j)]. Police arrested him (i).
(7) Police arrested [John, who attacked Bob].
(8) Police arrested [Bob, whom John attacked].

... "A of B" ... [2], [3], [19] ...
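To make Eqs. (1)-(3) concrete, the following is a minimal NumPy sketch of the triple-scoring network and the max-margin objective. The toy vocabulary, the dimensions d and h, and the names S, Vb, O, W1, W2, b1 are illustrative assumptions, not the paper's implementation.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and sizes (d: word embedding size, h: hidden size).
vocab = {w: i for i, w in enumerate(["police", "arrest", "criminal", "tomato", "john", "bob", "attack"])}
d, h, V = 8, 16, len(vocab)

# Separate subject/verb/object embedding tables s(w), v(w), o(w), one row per word.
S = rng.normal(scale=0.1, size=(V, d))
Vb = rng.normal(scale=0.1, size=(V, d))
O = rng.normal(scale=0.1, size=(V, d))
W1 = rng.normal(scale=0.1, size=(h, 3 * d))   # R^{h x 3d}
W2 = rng.normal(scale=0.1, size=(1, h))       # R^{1 x h}
b1 = np.zeros(h)

def g(ws, wv, wo):
    """Score of an SVO triple: g = W2 tanh(W1 [s(ws); v(wv); o(wo)] + b1), as in Eqs. (1)-(2)."""
    x = np.concatenate([S[vocab[ws]], Vb[vocab[wv]], O[vocab[wo]]])
    a = np.tanh(W1 @ x + b1)
    return float(W2 @ a)

def margin_loss(pos, neg_s, neg_o):
    """Max-margin loss of Eq. (3): the observed triple should outscore corrupted ones by a margin of 1."""
    ws, wv, wo = pos
    gp = g(ws, wv, wo)
    return (max(0.0, 1.0 - gp + g(neg_s, wv, wo)) +
            max(0.0, 1.0 - gp + g(ws, wv, neg_o)) +
            max(0.0, 1.0 - gp + g(neg_s, wv, neg_o)))

print(g("police", "arrest", "criminal"), g("police", "arrest", "tomato"))
print(margin_loss(("police", "arrest", "criminal"), "tomato", "tomato"))

Under this setup, a trained model would assign a higher score to (police, arrest, criminal) than to (police, arrest, tomato), which is what Fig. 2 illustrates.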
... (7) ... John ... (John, arrest, Bob) ... (8) ... Bob ... (John, arrest, Bob) ... (Fig. 3).

Building on the Van de Cruys [5] SVO model and the embeddings of [14], each argument word a is paired with its context predicate-argument structure c_a = <c_{a,s}, c_{a,v}, c_{a,o}>, the subject, verb and object of the clause in which a appears, with either c_{a,s} = a or c_{a,o} = a; c_{a,c} ∈ {subj, obj} records which role a fills.

(9) A man attacked a woman. (...) A police arrested him.

For "arrest" with a = man, the context c_a taken from "A man attacked a woman" is c_{a,s} = man, c_{a,v} = attack, c_{a,o} = woman, c_{a,c} = subj.

The SVO embeddings s(w_s) and o(w_o) are replaced by composed representations s_cmp(w_s, c_s) and o_cmp(w_o, c_o):

    g[(w_s, w_v, w_o, c_s, c_o)] = W_2 a(w_s, w_v, w_o, c_s, c_o)                          (4)
    a(w_s, w_v, w_o, c_s, c_o) = f(W_1 [s_cmp(w_s, c_s); v(w_v); o_cmp(w_o, c_o)] + b_1)   (5)

The role indicator c_{a,c} distinguishes, for "man", the contexts "a man attacks a woman" and "a woman attacks a man". s_cmp is defined as

    s_cmp(w_s, c_s) = f(W_subj [s(c_{s,s}); v(c_{s,v}); o(c_{s,o})])   if c_{s,c} = subj;
                      f(W_obj  [s(c_{s,s}); v(c_{s,v}); o(c_{s,o})])   if c_{s,c} = obj;
                      s(w_s)                                           otherwise;

where c_{s,s}, c_{s,v}, c_{s,o} are the elements of the context clause: W_subj is applied when the word is the subject ("a man attacks ...") and W_obj when it is the object ("... attack a man"), so that "a man attacked a woman" and "a woman attacked a man" yield different representations. o_cmp is defined analogously:

    o_cmp(w_o, c_o) = f(W_subj [s(c_{o,s}); v(c_{o,v}); o(c_{o,o})])   if c_{o,c} = subj;   (6)
                      f(W_obj  [s(c_{o,s}); v(c_{o,v}); o(c_{o,o})])   if c_{o,c} = obj;
                      o(w_o)                                           otherwise.

... "John attacked Bob" / "John who attacked Bob" ... Ji's Recursive Neural Network (RNN) based Entity Vector [22] ... Entity Vector ... (7)
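A small sketch of the composed subject representation s_cmp under the same kind of illustrative setup; the embedding tables, the context-tuple format (c_s, c_v, c_o, role), and the parameter names W_subj, W_obj are assumptions for illustration, and o_cmp would be built the same way with the fallback o(w_o).

import numpy as np

rng = np.random.default_rng(1)
d = 8
words = ["man", "woman", "attack", "policeman", "arrest", "boy"]
# Hypothetical per-role embedding tables s/v/o, mirroring the sketch above.
s_emb = {w: rng.normal(scale=0.1, size=d) for w in words}
v_emb = {w: rng.normal(scale=0.1, size=d) for w in words}
o_emb = {w: rng.normal(scale=0.1, size=d) for w in words}
W_subj = rng.normal(scale=0.1, size=(d, 3 * d))  # used when the word is the subject of its context clause
W_obj = rng.normal(scale=0.1, size=(d, 3 * d))   # used when the word is the object of its context clause

def s_cmp(w_s, c):
    """Composed subject representation from the context clause c = (c_s, c_v, c_o, role).

    role is "subj" or "obj"; with no context clause (c is None), fall back to the plain embedding s(w_s).
    """
    if c is None:
        return s_emb[w_s]
    c_s, c_v, c_o, role = c
    x = np.concatenate([s_emb[c_s], v_emb[c_v], o_emb[c_o]])
    W = W_subj if role == "subj" else W_obj
    return np.tanh(W @ x)

# Example (9): the antecedent "man" is represented through its role in <man, attack, woman, subj>.
vec_man_as_attacker = s_cmp("man", ("man", "attack", "woman", "subj"))
vec_man_plain = s_cmp("man", None)
print(vec_man_as_attacker.shape, np.allclose(vec_man_as_attacker, vec_man_plain))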
4.1 Training. The parameters are the embeddings s, v, o, the bias b_1, and W_subj, W_obj ... tanh (range (-1, 1)) ... s, v, o ... tanh.

Three types of training instances are used:
    Type A:  <w_s, w_v, w_o>
    Type B1: <w_s, w_v, w_o, ...>
    Type B2: <w_s, w_v, w_o, ...>

(10) [The old man (i)] attacked [a boy (j)]. A policeman arrested the man (i).

For "the man (i)" (= "The old man (i)"), Type A yields <policeman, arrest, man> and Type B2 yields <policeman, arrest, man, <man, attack, boy, subj>>.

The training data are extracted from the ClueWeb12 Web corpus (*2) parsed with Stanford CoreNLP [23] (*3), giving Type A, Type B1 and Type B2 instances ... (1) ... (2) ... ,177,864 ... 1,282,994 ... 1,140,266 ...

Negative examples: (1) for (10), Type A <policeman, arrest, man> is corrupted to, e.g., <policeman, arrest, book>, and Type B2 <policeman, arrest, man, <man, attack, boy, subj>> to, e.g., <policeman, arrest, banana, <monkey, eat, banana, obj>>. (2) For Type B1 and Type B2, the role indicator is also flipped: <policeman, arrest, man, <man, attack, boy, subj>> vs. <policeman, arrest, man, <man, attack, boy, obj>>.

The objective contrasts an observed instance with its corruptions:

    - log σ(g[(w_s, w_v, w_o)])
      - log(1 - σ(g[(\bar{w}_s, w_v, w_o)]))
      - log(1 - σ(g[(w_s, w_v, \bar{w}_o)]))
      - log(1 - σ(g[(\bar{w}_s, w_v, \bar{w}_o)]))                          (8)

where σ is the logistic sigmoid and \bar{w}_s, \bar{w}_o are corrupted words.

5. ... [2], [3], [19] ...
5.1 ... (8) ...
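Before turning to the experiments in Section 5, here is a sketch of how pseudo-negative triples and the objective of Eq. (8) could be computed; corrupt and nll_with_negatives are hypothetical helper names, and the scores passed in would come from g[.] as sketched above.

import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nll_with_negatives(g_pos, g_neg_s, g_neg_o, g_neg_so):
    """Negative log-likelihood of Eq. (8): the observed triple is a positive example;
    triples with a corrupted subject, object, or both serve as negatives."""
    return -(np.log(sigmoid(g_pos))
             + np.log(1.0 - sigmoid(g_neg_s))
             + np.log(1.0 - sigmoid(g_neg_o))
             + np.log(1.0 - sigmoid(g_neg_so)))

def corrupt(triple, vocab):
    """Draw pseudo-negative triples by replacing the subject and/or object with random words,
    e.g. <policeman, arrest, man> -> <policeman, arrest, book>."""
    ws, wv, wo = triple
    bar_s, bar_o = rng.choice(vocab), rng.choice(vocab)
    return (bar_s, wv, wo), (ws, wv, bar_o), (bar_s, wv, bar_o)

vocab = np.array(["book", "banana", "monkey", "boy", "tomato"])
print(corrupt(("policeman", "arrest", "man"), vocab))
print(nll_with_negatives(2.0, -1.0, -0.5, -2.0))  # the four scores would come from g[.]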
Evaluation uses Mean Reciprocal Rank (MRR):

    MRR = (1 / |Q|) Σ_{i=1}^{|Q|} 1 / rank(i)                               (9)

where Q is the set of queries and rank(i) is the rank of the correct antecedent for query i. A pseudo-disambiguation test is also used (cf. (4)). Coreference metrics such as CEAF and BLANC [24], [25], etc. ... OntoNotes 4.0 [26] ... OntoNotes ... Web ...

(11) In his (p) 40-minute speech (i), Chen (j) declared the determination (k) of the people (l) of Taiwan (m) to continue forward... (...)... Chen (n) visited... (...), and he (p') stated... (ectb 1025)

For the pronouns he (p') and his (p), the candidate antecedents are the mentions i, j, k, l, m ... (1) ... (2) ... (3) ... MRR ... SVO model vs. Entity Vector (*4) ... For Chen (j) vs. Chen (n), the MRR rank ... for Chen (j), the context is c_j = <Chen, declared, determination, subj>; with no context, c_j = φ. OntoNotes ... d = ..., AdaGrad [27], 1,000 ..., 20 ...

Comparison with the Van de Cruys [5] SVO baseline ... (1) ... (2) ... (3) ... Entity Vector ... MRR ... SVO ...
*4 Wilcoxon ...
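The MRR of Eq. (9) can be computed as in the sketch below; the query format (a list of candidate scores plus the index of the gold antecedent) is an assumption for illustration.

def mean_reciprocal_rank(queries):
    """MRR of Eq. (9): average over queries of 1 / rank of the correct antecedent.

    Each query is (scores, gold_index), where scores[i] is the model score g[.] for
    candidate antecedent i; higher scores are ranked first.
    """
    total = 0.0
    for scores, gold in queries:
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        rank = order.index(gold) + 1
        total += 1.0 / rank
    return total / len(queries)

# Toy example: two pronouns, each with three candidate antecedents; the gold antecedent is candidate 0.
print(mean_reciprocal_rank([([0.9, 0.2, 0.4], 0), ([0.5, 0.8, 0.3], 0)]))  # (1/1 + 1/2) / 2 = 0.75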
Fig. 4 (pseudo-disambiguation): SVO ... axis: 25.00%, 50.00%, 75.00%, 100% ... e.g., "people eat book" vs. "people eat food" ... MRR ... (*5)

SVO instances: 26,500,536 ... 7,323,250 ... MRR (0.3866) ... MRR of the baseline (SVO) ... "in restaurant" ... incorrectly ...

(12) All these were the highest levels in history. Since being put into operation five years ago, the Tianjin Port Bonded Area (i) has completed the construction of China's first goods distribution center, functions like a customs port, opened up the special use the railway line from... Moreover, the bonded area (j) has implemented a series of preferential policies towards enterprises entering the area (k): it (pro) has established a system of no custom accounting and established manuals and management... (chtb 0099)

The candidate antecedents of it (pro) are the Tianjin Port Bonded Area (i), the bonded area (j), and the area (k) ... SVO ... for <it (pro), establish, ...>, the composed representation "X (X which completed construction)" for i and "X (X which implemented series)" for j ... establish ...
6. ... recurrent ...

Acknowledgments: JSPS 15H01702, 15H05318, 15K16045; JST CREST.

[1] Bergsma, S.: Discriminative Learning of Selectional Preference from Unlabeled Text, EMNLP (2008).
[2] Rahman, A. and Ng, V.: Resolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge, EMNLP-CoNLL (2012).
[3] Peng, H., Khashabi, D. and Roth, D.: Solving Hard Coreference Problems, NAACL (2015).
[4] Resnik, P.: Selectional constraints: An information-theoretic model and its computational realization, Cognition, Vol. 61, No. 1 (1996).
[5] Van de Cruys, T.: A neural network approach to selectional preference acquisition, EMNLP (2014).
[6] Rooth, M., Riezler, S., Prescher, D., Carroll, G. and Beil, F.: Inducing a Semantically Annotated Lexicon via EM-Based Clustering, ACL (1999).
[7] Ó Séaghdha, D.: Latent variable models of selectional preference, ACL (2010).
[8] Ritter, A., Etzioni, O. et al.: A latent dirichlet allocation method for selectional preferences, ACL (2010).
[9] Kawahara, D., Peterson, D. W. and Palmer, M.: A Step-wise Usage-based Method for Inducing Polysemy-aware Verb Classes, ACL (2014).
[10] Erk, K.: A simple, similarity-based model for selectional preferences, ACL, Vol. 45, No. 1, p. 216 (2007).
[11] Erk, K., Padó, S. and Padó, U.: A flexible, corpus-driven model of regular and inverse selectional preferences, Computational Linguistics, Vol. 36, No. 4 (2010).
[12] Fellbaum, C.: WordNet: An Electronic Lexical Database, Bradford Books (1998).
[13] Kawahara, D. and Kurohashi, S.: Case frame compilation from the web using high-performance computing, NAACL, pp. 1-7 (2006).
[14] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S. and Dean, J.: Distributed representations of words and phrases and their compositionality, NIPS (2013).
[15] Van de Cruys, T., Poibeau, T. and Korhonen, A.: A tensor-based factorization model of semantic compositionality, HLT-NAACL (2013).
[16] Chambers, N. and Jurafsky, D.: Unsupervised Learning of Narrative Schemas and their Participants, ACL (2009).
[17] Lin, D. and Pantel, P.: DIRT: discovery of inference rules from text, KDD '01 (2001).
[18] J. Berant, T. A. and Goldberger, J.: Global Learning of Typed Entailment Rules, ACL (2008).
[19] Inoue, N., Ovchinnikova, E., Inui, K. and Hobbs, J.: Coreference Resolution with ILP-based Weighted Abduction, COLING (2012).
[20] Modi, A. and Titov, I.: Inducing neural models of script knowledge, CoNLL, p. 49 (2014).
[21] Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K. and Kuksa, P.: Natural language processing (almost) from scratch, The Journal of Machine Learning Research, Vol. 12 (2011).
[22] Ji, Y. and Eisenstein, J.: One Vector is Not Enough: Entity-Augmented Distributed Semantics for Discourse Relations (2015).
[23] Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. J. and McClosky, D.: The Stanford CoreNLP Natural Language Processing Toolkit, ACL System Demonstrations (2014).
[24] Luo, X.: On Coreference Resolution Performance Metrics, HLT/EMNLP (2005).
[25] Recasens, M. and Hovy, E. H.: BLANC: Implementing the Rand Index for Coreference Evaluation, Journal of Natural Language Engineering (2010).
[26] Hovy, E., Marcus, M., Palmer, M., Ramshaw, L. and Weischedel, R.: OntoNotes: the 90% solution, NAACL Companion Volume: Short Papers (2006).
[27] Duchi, J., Hazan, E. and Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization, The Journal of Machine Learning Research, Vol. 12 (2011).