frequencies that matter in sentence processing

frequencies that matter in sentence processing Marten van Schijndel¹ September 7, 2015 ¹Department of Linguistics, The Ohio State University van Schijndel Frequencies that matter September 7, 2015 0 / 69

frequencies are important van Schijndel Frequencies that matter September 7, 2015 1 / 69

frequencies are important Occurrence frequencies describe languages well Zipf Statistical NLP (esp. vector spaces) van Schijndel Frequencies that matter September 7, 2015 1 / 69

frequencies are important Occurrence frequencies describe languages well Zipf Statistical NLP (esp. vector spaces) Occurrence frequencies have major influence on sentence processing Behavioral measures (e.g., reading times) Processing measures (e.g., ERPs) Uniform Information Density Saarland SFB-1102 van Schijndel Frequencies that matter September 7, 2015 1 / 69

null hypothesis demands control van Schijndel Frequencies that matter September 7, 2015 2 / 69

null hypothesis demands control Linguists must control for frequencies to do research van Schijndel Frequencies that matter September 7, 2015 2 / 69

null hypothesis demands control Linguists must control for frequencies to do research How do people try to account for frequencies? van Schijndel Frequencies that matter September 7, 2015 2 / 69

Case Study 1: Cloze Probabilities van Schijndel, Culicover, & Schuler (2014) Pertains to: Pickering & Traxler (2003), inter alia van Schijndel Frequencies that matter September 7, 2015 3 / 69

generation/cloze probabilities Ask subjects to generate distribution van Schijndel Frequencies that matter September 7, 2015 4 / 69

generation/cloze probabilities Ask subjects to generate distribution Sentence generation norming: Write sentences with these words landed, sneezed, laughed, van Schijndel Frequencies that matter September 7, 2015 4 / 69

generation/cloze probabilities Ask subjects to generate distribution Sentence generation norming: Write sentences with these words landed, sneezed, laughed, Cloze norming: Complete this sentence The pilot landed van Schijndel Frequencies that matter September 7, 2015 4 / 69

generation/cloze probabilities Ask subjects to generate distribution Sentence generation norming: Write sentences with these words landed, sneezed, laughed, Cloze norming: Complete this sentence The pilot landed the plane van Schijndel Frequencies that matter September 7, 2015 4 / 69

generation/cloze probabilities Ask subjects to generate distribution Sentence generation norming: Write sentences with these words landed, sneezed, laughed, Cloze norming: Complete this sentence The pilot landed the plane The pilot landed in the field van Schijndel Frequencies that matter September 7, 2015 4 / 69

generation/cloze probabilities Ask subjects to generate distribution Sentence generation norming: Write sentences with these words landed, sneezed, laughed, Cloze norming: Complete this sentence NP: The pilot landed the plane PP: The pilot landed in the field Pickering & Traxler (2003) used 6 cloze tasks to determine frequencies van Schijndel Frequencies that matter September 7, 2015 4 / 69

generation/cloze probabilities Ask subjects to generate distribution Sentence generation norming: Write sentences with these words landed, sneezed, laughed, Cloze norming: Complete this sentence NP: The pilot landed the plane (25%) PP: The pilot landed in the field (40%) Pickering & Traxler (2003) used 6 cloze tasks to determine frequencies van Schijndel Frequencies that matter September 7, 2015 4 / 69
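A cloze probability is simply the proportion of norming responses that continue the fragment a given way. A minimal sketch of that bookkeeping, with made-up response counts (the 25% NP / 40% PP split comes from the slide above; the "other" completions are assumed here for illustration):

```python
from collections import Counter

# Hypothetical cloze responses for "The pilot landed ...", coded by continuation type.
# The 25% NP / 40% PP split matches the figures cited above; "other" is assumed filler.
responses = ["NP"] * 25 + ["PP"] * 40 + ["other"] * 35

counts = Counter(responses)
total = sum(counts.values())
cloze = {category: n / total for category, n in counts.items()}
print(cloze)  # {'NP': 0.25, 'PP': 0.4, 'other': 0.35}
```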

pickering & traxler (2003) Stimuli (1) That's the plane that the pilot landed behind in the fog (2) That's the truck that the pilot landed behind in the fog Readers slow down at landed in (2) van Schijndel Frequencies that matter September 7, 2015 5 / 69

pickering & traxler (2003) Stimuli (1) That's the plane that the pilot landed behind in the fog (2) That's the truck that the pilot landed behind in the fog Readers slow down at landed in (2) Suggests they try to link truck as the object of landed, despite landed being biased for a PP complement: 40% PP complement vs. 25% NP complement van Schijndel Frequencies that matter September 7, 2015 5 / 69

pickering & traxler (2003) Readers initially adopt a transitive interpretation despite subcat bias van Schijndel Frequencies that matter September 7, 2015 6 / 69

pickering & traxler (2003) Readers initially adopt a transitive interpretation despite subcat bias Early-attachment processing heuristic van Schijndel Frequencies that matter September 7, 2015 6 / 69

pickering & traxler (2003) Readers initially adopt a transitive interpretation despite subcat bias Early-attachment processing heuristic But what about syntactic frequencies? van Schijndel Frequencies that matter September 7, 2015 6 / 69

generalized categorial grammar (gcg) Nguyen et al. (2012) [GCG derivation of "the apple that the girl ate"; categories include N, A-aN, D, N-aD, N-rN, V-gN, V-aN-gN, V-aN-bN] van Schijndel Frequencies that matter September 7, 2015 7 / 69

what about syntactic frequencies? Pickering & Traxler (2003) (1) That's the plane that the pilot landed behind in the fog (2) That's the truck that the pilot landed behind in the fog (a) Transitive: [VP-gNP [VP-gNP [TV landed] t_i] [PP [P behind] NP]] (b) Intransitive: [VP-gNP [VP [IV landed]] [PP-gNP [P behind] t_i]] van Schijndel Frequencies that matter September 7, 2015 8 / 69

what about syntactic frequencies? Pickering & Traxler (2003) (1) That's the plane that the pilot landed behind in the fog (2) That's the truck that the pilot landed behind in the fog van Schijndel et al. (2014) Using syntactic probabilities with cloze data: van Schijndel Frequencies that matter September 7, 2015 11 / 69

what about syntactic frequencies? Pickering & Traxler (2003) (1) That's the plane that the pilot landed behind in the fog (2) That's the truck that the pilot landed behind in the fog van Schijndel et al. (2014) Using syntactic probabilities with cloze data: P(Transitive | landed) ≈ 0.016 P(Intransitive | landed) ≈ 0.004 van Schijndel Frequencies that matter September 7, 2015 11 / 69

what about syntactic frequencies? Pickering & Traxler (2003) (1) That's the plane that the pilot landed behind in the fog (2) That's the truck that the pilot landed behind in the fog van Schijndel et al. (2014) Using syntactic probabilities with cloze data: P(Transitive | landed) ≈ 0.016 P(Intransitive | landed) ≈ 0.004 Transitive interpretation is 300% more likely! van Schijndel Frequencies that matter September 7, 2015 11 / 69

results: case study 1 Subcat processing accounted for by hierarchic syntactic frequencies Early attachment heuristic unnecessary van Schijndel Frequencies that matter September 7, 2015 12 / 69

results: case study 1 Subcat processing accounted for by hierarchic syntactic frequencies Early attachment heuristic unnecessary Also applies to heavy-np shift heuristics (Staub, 2006), unaccusative processing (Staub et al, 2007), etc van Schijndel Frequencies that matter September 7, 2015 12 / 69

results: case study 1 Subcat processing accounted for by hierarchic syntactic frequencies Early attachment heuristic unnecessary Also applies to heavy-np shift heuristics (Staub, 2006), unaccusative processing (Staub et al, 2007), etc Suggests cloze probabilities are insufficient as a frequency control van Schijndel Frequencies that matter September 7, 2015 12 / 69

results: case study 1 Subcat processing accounted for by hierarchic syntactic frequencies Early attachment heuristic unnecessary Also applies to heavy-np shift heuristics (Staub, 2006), unaccusative processing (Staub et al, 2007), etc Suggests cloze probabilities are insufficient as a frequency control But do people use hierarchic syntactic probabilities? van Schijndel Frequencies that matter September 7, 2015 12 / 69

Case Study 2: N-grams and Syntactic Probabilities van Schijndel & Schuler (2015) Pertains to: Frank & Bod (2011), inter alia van Schijndel Frequencies that matter September 7, 2015 13 / 69

case study 2: overview Previous studies have debated whether humans use hierarchic syntax van Schijndel Frequencies that matter September 7, 2015 14 / 69

case study 2: overview Previous studies have debated whether humans use hierarchic syntax [NP [NP [D the] [N apple]] [RC [IN that] [VP [NP [D the] [NP [N girl]]] [V ate]]]] van Schijndel Frequencies that matter September 7, 2015 14 / 69

case study 2: overview Previous studies have debated whether humans use hierarchic syntax [NP [NP [D the] [N apple]] [RC [IN that] [VP [NP [D the] [NP [N girl]]] [V ate]]]] But how robust were their models? van Schijndel Frequencies that matter September 7, 2015 14 / 69

case study 2: overview This work shows that: van Schijndel Frequencies that matter September 7, 2015 15 / 69

case study 2: overview This work shows that: N-gram models can be greatly improved (accumulation) van Schijndel Frequencies that matter September 7, 2015 15 / 69

case study 2: overview This work shows that: N-gram models can be greatly improved (accumulation) Hierarchic syntax is still predictive over stronger baseline (Long distance dependencies independently improve model) van Schijndel Frequencies that matter September 7, 2015 15 / 69

hierarchic syntax in reading? The(1) red(2) apple that the girl ate Frank & Bod (2011) van Schijndel Frequencies that matter September 7, 2015 16 / 69

hierarchic syntax in reading? The(1) red(2) apple that the girl ate Frank & Bod (2011) Baseline: Sentence Position Word length N-grams (Unigram, bigram) van Schijndel Frequencies that matter September 7, 2015 16 / 69

hierarchic syntax in reading? The(w1) red(w2) apple(w3) that(w4) the(w5) girl(w6) ate Frank & Bod (2011) Baseline: Sentence Position Word length N-grams (Unigram, bigram) van Schijndel Frequencies that matter September 7, 2015 16 / 69

hierarchic syntax in reading? The red apple that the girl (4 chars) ate Frank & Bod (2011) Baseline: Sentence Position Word length N-grams (Unigram, bigram) van Schijndel Frequencies that matter September 7, 2015 16 / 69

hierarchic syntax in reading? The red apple that the girl ate Frank & Bod (2011) Baseline: Sentence Position Word length N-grams (Unigram, bigram) Test POS Predictors: Echo State Network (ESN) Phrase Structure Grammar (PSG) van Schijndel Frequencies that matter September 7, 2015 16 / 69

hierarchic syntax in reading? DT ADJ NN IN DT NN VBD Frank & Bod (2011) Baseline: Sentence Position Word length N-grams (Unigram, bigram) Test POS Predictors: Echo State Network (ESN) Phrase Structure Grammar (PSG) van Schijndel Frequencies that matter September 7, 2015 16 / 69

hierarchic syntax in reading? [NP [NP DT [NP JJ NN]] [RC IN [VP [NP DT [NP NN]] VBD]]] Frank & Bod (2011) Baseline: Sentence Position Word length N-grams (Unigram, bigram) Test POS Predictors: Echo State Network (ESN) Phrase Structure Grammar (PSG) van Schijndel Frequencies that matter September 7, 2015 16 / 69

hierarchic syntax in reading? Frank & Bod (2011) Baseline: Sentence Position Word length N-grams (Unigram, bigram) Test POS Predictors: Echo State Network (ESN) Phrase Structure Grammar (PSG) van Schijndel Frequencies that matter September 7, 2015 17 / 69

hierarchic syntax in reading? Frank & Bod (2011) Baseline: Sentence Position Word length N-grams (Unigram, bigram) Test POS Predictors: Echo State Network (ESN) Phrase Structure Grammar (PSG) Outcome: PSG < ESN + PSG ESN = ESN + PSG van Schijndel Frequencies that matter September 7, 2015 17 / 69

hierarchic syntax in reading? Frank & Bod (2011) Baseline: Sentence Position Word length Test POS Predictors: Echo State Network (ESN) Phrase Structure Grammar (PSG) N-grams (Unigram, bigram) Outcome: PSG < ESN + PSG ESN = ESN + PSG Sequential helps over hierarchic van Schijndel Frequencies that matter September 7, 2015 17 / 69

hierarchic syntax in reading? Frank & Bod (2011) Baseline: Sentence Position Word length N-grams (Unigram, bigram) Test POS Predictors: Echo State Network (ESN) Phrase Structure Grammar (PSG) Outcome: PSG < ESN + PSG ESN = ESN + PSG Hierarchic doesn't help over sequential van Schijndel Frequencies that matter September 7, 2015 17 / 69

hierarchic syntax in reading? Fossum & Levy (2012) Replicated Frank & Bod (2011): PSG < ESN + PSG ESN = ESN + PSG van Schijndel Frequencies that matter September 7, 2015 18 / 69

hierarchic syntax in reading? Fossum & Levy (2012) Replicated Frank & Bod (2011): PSG < ESN + PSG ESN = ESN + PSG Better n-gram baseline (more data) changes result: PSG = ESN + PSG ESN = ESN + PSG van Schijndel Frequencies that matter September 7, 2015 18 / 69

hierarchic syntax in reading? Fossum & Levy (2012) Replicated Frank & Bod (2011): PSG < ESN + PSG ESN = ESN + PSG Better n-gram baseline (more data) changes result: PSG = ESN + PSG Sequential doesn't help over hierarchic ESN = ESN + PSG van Schijndel Frequencies that matter September 7, 2015 18 / 69

hierarchic syntax in reading? Fossum & Levy (2012) Replicated Frank & Bod (2011): PSG < ESN + PSG ESN = ESN + PSG Better n-gram baseline (more data) changes result: PSG = ESN + PSG Sequential doesn't help over hierarchic ESN = ESN + PSG Also: lexicalized syntax improves PSG fit van Schijndel Frequencies that matter September 7, 2015 18 / 69

improved n-gram baseline Most previous reading time studies: N-grams trained on WSJ, Dundee, BNC (or less) van Schijndel Frequencies that matter September 7, 2015 19 / 69

improved n-gram baseline Most previous reading time studies: N-grams trained on WSJ, Dundee, BNC (or less) This study: N-grams trained on Gigaword 4.0 van Schijndel Frequencies that matter September 7, 2015 19 / 69

improved n-gram baseline Most previous reading time studies: N-grams trained on WSJ, Dundee, BNC (or less) Unigrams/Bigrams This study: N-grams trained on Gigaword 4.0 van Schijndel Frequencies that matter September 7, 2015 19 / 69

improved n-gram baseline [Figure: n-gram perplexity (y-axis 14–20) as a function of n-gram order, n = 1–6] van Schijndel Frequencies that matter September 7, 2015 20 / 69

improved n-gram baseline Most previous reading time studies: N-grams trained on WSJ, Dundee, BNC Unigrams/Bigrams This study: N-grams trained on Gigaword 4.0 5-grams van Schijndel Frequencies that matter September 7, 2015 21 / 69

improved n-gram baseline Most previous reading time studies: N-grams trained on WSJ, Dundee, BNC Unigrams/Bigrams Only from region boundaries This study: N-grams trained on Gigaword 4.0 5-grams van Schijndel Frequencies that matter September 7, 2015 21 / 69

improved n-gram baseline Bigram Example Reading time of girl after red: The red(1) [apple that the(2) girl]_region ate — girl: bigram target; the: bigram condition van Schijndel Frequencies that matter September 7, 2015 22 / 69

improved n-gram baseline Bigram Example Reading time of girl after red: The red(1) [apple that the(2) girl]_region ate — girl: bigram target; the: bigram condition Fails to capture entire sequence; van Schijndel Frequencies that matter September 7, 2015 22 / 69

improved n-gram baseline Bigram Example Reading time of girl after red: The red(1) [apple that the(2) girl]_region ate — girl: bigram target; the: bigram condition Fails to capture entire sequence; Conditions never generated; van Schijndel Frequencies that matter September 7, 2015 22 / 69

improved n-gram baseline Bigram Example Reading time of girl after red: The red(1) [apple that the(2) girl]_region ate — girl: bigram target; the: bigram condition Fails to capture entire sequence; Conditions never generated; Probability of sequence is deficient van Schijndel Frequencies that matter September 7, 2015 22 / 69

improved n-gram baseline Cumulative Bigram Example Reading time of girl after red: The(1) red apple that the(2) girl ate — bigram targets and bigram conditions now cover every word up to girl van Schijndel Frequencies that matter September 7, 2015 23 / 69

improved n-gram baseline Cumulative Bigram Example Reading time of girl after red: The(1) red apple that the(2) girl ate — bigram targets and bigram conditions now cover every word up to girl Captures entire sequence; Well-formed sequence probability; Reflects processing that must be done by humans van Schijndel Frequencies that matter September 7, 2015 23 / 69
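A minimal sketch of the contrast, with a toy bigram table and hypothetical helper names (not the study's code): the non-cumulative predictor conditions the fixated word only on its immediate predecessor, while the cumulative predictor sums bigram surprisals over every word between the previous fixation and the current one, so the whole skipped region contributes.

```python
import math

def bigram_surprisal(prev, word, bigram_prob):
    """Surprisal (bits) of `word` given `prev` under a bigram model."""
    return -math.log2(bigram_prob[(prev, word)])

def noncumulative(words, prev_fix, cur_fix, bigram_prob):
    """Condition the fixated word only on the word immediately before it."""
    return bigram_surprisal(words[cur_fix - 1], words[cur_fix], bigram_prob)

def cumulative(words, prev_fix, cur_fix, bigram_prob):
    """Sum bigram surprisals over every word after the previous fixation
    up to and including the current one, covering the skipped region."""
    return sum(bigram_surprisal(words[i - 1], words[i], bigram_prob)
               for i in range(prev_fix + 1, cur_fix + 1))

# Toy example: previous fixation on "red" (index 1), current fixation on "girl" (index 5).
words = ["The", "red", "apple", "that", "the", "girl", "ate"]
bigram_prob = {(words[i - 1], words[i]): 0.1 for i in range(1, len(words))}  # placeholder probabilities
print(noncumulative(words, 1, 5, bigram_prob))  # surprisal of "girl" given "the" only
print(cumulative(words, 1, 5, bigram_prob))     # also includes "apple", "that", "the"
```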

improved n-gram baseline Most previous reading time studies: N-grams trained on WSJ, Dundee, BNC Unigrams/Bigrams Only from region boundaries This study: N-grams trained on Gigaword 4.0 5-grams van Schijndel Frequencies that matter September 7, 2015 24 / 69

improved n-gram baseline Most previous reading time studies: N-grams trained on WSJ, Dundee, BNC Unigrams/Bigrams Only from region boundaries This study: N-grams trained on Gigaword 4.0 5-grams Cumulative and Non-cumulative van Schijndel Frequencies that matter September 7, 2015 24 / 69

evaluation Dundee Corpus (Kennedy et al., 2003) 10 subjects 2,388 sentences First pass durations (~200,000) Go-past durations (~200,000) Exclusions: Unknown words (<5 tokens) First and last of each line Regions larger than 4 words (track loss) van Schijndel Frequencies that matter September 7, 2015 25 / 69

cumu-n-grams predict reading times Baseline: Fixed Effects Sentence Position Word length Region Length Preceding word fixated? Random Effects Item/Subject Intercepts By Subject Slopes: All Fixed Effects N-grams (5-grams) N-grams (Cumu-5-grams) van Schijndel Frequencies that matter September 7, 2015 26 / 69
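The baseline above is a mixed-effects regression, with the n-gram predictors added on top and evaluated by improvement in fit. A rough sketch of that comparison, assuming a hypothetical reading_times.csv and a single subject grouping (the original analysis also had item intercepts and by-subject slopes for all fixed effects, which this simplifies):

```python
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Hypothetical file with one row per fixated region:
# columns: rt, sentpos, wordlen, regionlen, prevfix, ngram, cumu_ngram, subject, item
data = pd.read_csv("reading_times.csv")

def fit(formula):
    # Random intercepts by subject only; a simplification of the crossed
    # item/subject intercepts and by-subject slopes described on the slide.
    model = smf.mixedlm(formula, data, groups=data["subject"])
    return model.fit(reml=False)  # ML so log-likelihoods are comparable across fixed effects

base = fit("rt ~ sentpos + wordlen + regionlen + prevfix + ngram")
full = fit("rt ~ sentpos + wordlen + regionlen + prevfix + ngram + cumu_ngram")

# Likelihood-ratio test: does cumulative n-gram surprisal improve fit over the baseline?
lr = 2 * (full.llf - base.llf)
print("chi2 =", lr, "p =", stats.chi2.sf(lr, df=1))
```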

cumu-n-grams predict reading times First Pass and Go-Past van Schijndel Frequencies that matter September 7, 2015 27 / 69

follow-up questions Is hierarchic surprisal useful over the better baseline? van Schijndel Frequencies that matter September 7, 2015 28 / 69

follow-up questions Is hierarchic surprisal useful over the better baseline? If so, can it be similarly improved through accumulation? van Schijndel Frequencies that matter September 7, 2015 28 / 69

follow-up questions Is hierarchic surprisal useful over the better baseline? If so, can it be similarly improved through accumulation? van Schijndel & Schuler (2013) found it could over weaker baselines Grammar: Berkeley parser, WSJ, 5 split-merge cycles (Petrov & Klein 2007) van Schijndel Frequencies that matter September 7, 2015 28 / 69
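For reference, the surprisal predictor used throughout is the standard one from the sentence-processing literature (Hale 2001; Levy 2008): the negative log probability of the next word given its preceding context, which an incremental parser obtains from the ratio of successive prefix probabilities.

```latex
% Surprisal of word w_t in context (Hale 2001; Levy 2008):
\mathrm{surprisal}(w_t) \;=\; -\log P(w_t \mid w_1 \ldots w_{t-1})
                       \;=\; \log \frac{P(w_1 \ldots w_{t-1})}{P(w_1 \ldots w_t)}
```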

hierarchic surprisal predicts reading times Baseline: Fixed Effects Same as before N-grams (5-grams) N-grams (Cumu-5-grams) van Schijndel Frequencies that matter September 7, 2015 29 / 69

hierarchic surprisal predicts reading times Baseline: Fixed Effects Same as before N-grams (5-grams) N-grams (Cumu-5-grams) Random Effects Same as before By Subject Slopes: Hierarchic surprisal Cumu-Hierarchic surprisal van Schijndel Frequencies that matter September 7, 2015 29 / 69

hierarchic surprisal predicts reading times First Pass and Go-Past van Schijndel Frequencies that matter September 7, 2015 30 / 69

cumulative surprisal doesn't help?! Suggests previous findings were due to weaker n-gram baseline van Schijndel Frequencies that matter September 7, 2015 31 / 69

cumulative surprisal doesn't help?! Suggests previous findings were due to weaker n-gram baseline Suggests only local PCFG surprisal affects reading times van Schijndel Frequencies that matter September 7, 2015 31 / 69

cumulative surprisal doesn't help?! Suggests previous findings were due to weaker n-gram baseline Suggests only local PCFG surprisal affects reading times Follow-up work shows long distance dependencies independently influence reading times van Schijndel Frequencies that matter September 7, 2015 31 / 69

results: case study 2 Hierarchic syntax predicts reading times over strong linear baseline van Schijndel Frequencies that matter September 7, 2015 32 / 69

results: case study 2 Hierarchic syntax predicts reading times over strong linear baseline Long-distance dependencies help over hierarchic syntax van Schijndel Frequencies that matter September 7, 2015 32 / 69

results: case study 2 Hierarchic syntax predicts reading times over strong linear baseline Long-distance dependencies help over hierarchic syntax Studies should use cumu-n-grams in their baselines van Schijndel Frequencies that matter September 7, 2015 32 / 69

what does this mean for our models? We need to carefully control for: Cloze probabilities (Smith 2011) van Schijndel Frequencies that matter September 7, 2015 33 / 69

what does this mean for our models? We need to carefully control for: Cloze probabilities (Smith 2011) N-gram frequencies (local and cumulative) van Schijndel Frequencies that matter September 7, 2015 33 / 69

what does this mean for our models? We need to carefully control for: Cloze probabilities (Smith 2011) N-gram frequencies (local and cumulative) Hierarchic syntactic frequencies van Schijndel Frequencies that matter September 7, 2015 33 / 69

what does this mean for our models? We need to carefully control for: Cloze probabilities (Smith 2011) N-gram frequencies (local and cumulative) Hierarchic syntactic frequencies Long distance dependency frequencies van Schijndel Frequencies that matter September 7, 2015 33 / 69

what does this mean for our models? We need to carefully control for: Cloze probabilities (Smith 2011) N-gram frequencies (local and cumulative) Hierarchic syntactic frequencies Long distance dependency frequencies (discourse, etc) Then we can try to interpret experimental results What do we do about convergence? Is there a way to avoid this explosion of control predictors? van Schijndel Frequencies that matter September 7, 2015 33 / 69

Case Study 3: Evading Frequency Confounds van Schijndel, Murphy, & Schuler (2015) van Schijndel Frequencies that matter September 7, 2015 34 / 69

motivation Can we measure memory load with fewer controls? van Schijndel Frequencies that matter September 7, 2015 35 / 69

motivation Can we measure memory load with fewer controls? Why do so many factors influence results? van Schijndel Frequencies that matter September 7, 2015 35 / 69

motivation Can we measure memory load with fewer controls? Why do so many factors influence results? Low dimensionality measures van Schijndel Frequencies that matter September 7, 2015 35 / 69

motivation Can we measure memory load with fewer controls? Why do so many factors influence results? Low dimensionality measures Do the factors become separable in another space? van Schijndel Frequencies that matter September 7, 2015 35 / 69

motivation Can we measure memory load with fewer controls? Why do so many factors influence results? Low dimensionality measures Do the factors become separable in another space? Let's try using MEG van Schijndel Frequencies that matter September 7, 2015 35 / 69

what is meg? van Schijndel Frequencies that matter September 7, 2015 36 / 69

what is meg? 102 locations van Schijndel Frequencies that matter September 7, 2015 36 / 69

how might meg reflect load? Jensen et al. (2012) van Schijndel Frequencies that matter September 7, 2015 37 / 69

how might meg reflect load? Jensen et al. (2012) van Schijndel Frequencies that matter September 7, 2015 38 / 69

where to look? Memory is a function of distributed processing van Schijndel Frequencies that matter September 7, 2015 39 / 69

where to look? Memory is a function of distributed processing Look for synchronized firing between sensors (brain regions) van Schijndel Frequencies that matter September 7, 2015 39 / 69

where to look? van Schijndel Frequencies that matter September 7, 2015 40 / 69

where to look? Memory is a function of distributed processing Look for synchronized firing between sensors (brain regions) This study uses spectral coherence measurements van Schijndel Frequencies that matter September 7, 2015 41 / 69

spectral coherence coherence(x, y) = E[S_xy] / √(E[S_xx] · E[S_yy]) van Schijndel Frequencies that matter September 7, 2015 42 / 69

spectral coherence coherence(x, y) = E[S_xy] / √(E[S_xx] · E[S_yy]) — numerator: cross-correlation (cross-spectrum); denominator: autocorrelations (autospectra) van Schijndel Frequencies that matter September 7, 2015 42 / 69

spectral coherence coherence(x, y) = E[S_xy] / √(E[S_xx] · E[S_yy]) — numerator: cross-correlation (cross-spectrum); denominator: autocorrelations (autospectra) Amount of connectivity (synchronization) not caused by chance van Schijndel Frequencies that matter September 7, 2015 42 / 69
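A small numpy sketch of that quantity, assuming epoched data from two sensors and illustrative function names (the study's own pipeline grouped MEG epochs by embedding depth): the cross-spectrum and autospectra are averaged over epochs and combined exactly as in the definition above.

```python
import numpy as np

def epoch_coherence(x_epochs, y_epochs):
    """Spectral coherence between two sensors, estimated across epochs.

    x_epochs, y_epochs: arrays of shape (n_epochs, n_samples), one row per epoch.
    Returns |E[S_xy]| / sqrt(E[S_xx] * E[S_yy]) per frequency bin, where S_xy is
    the cross-spectrum, S_xx/S_yy the autospectra, and E[.] the epoch average.
    """
    X = np.fft.rfft(x_epochs, axis=1)
    Y = np.fft.rfft(y_epochs, axis=1)
    s_xy = np.mean(X * np.conj(Y), axis=0)    # cross-spectrum, epoch-averaged
    s_xx = np.mean(np.abs(X) ** 2, axis=0)    # autospectrum of x
    s_yy = np.mean(np.abs(Y) ** 2, axis=0)    # autospectrum of y
    return np.abs(s_xy) / np.sqrt(s_xx * s_yy)

# Toy usage with simulated data: 4 epochs, 125 Hz sampling, 1 s each.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 125))
y = x + 0.5 * rng.standard_normal((4, 125))   # y partially synchronized with x
freqs = np.fft.rfftfreq(125, d=1 / 125)
print(freqs.shape, epoch_coherence(x, y).shape)
```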

spectral coherence: phase synchrony Fell & Axmacher (2011) van Schijndel Frequencies that matter September 7, 2015 43 / 69

audiobook meg corpus Collected 2 years ago at CMU van Schijndel Frequencies that matter September 7, 2015 44 / 69

audiobook meg corpus Collected 2 years ago at CMU 3 subjects van Schijndel Frequencies that matter September 7, 2015 44 / 69

audiobook meg corpus Collected 2 years ago at CMU 3 subjects Heart of Darkness, ch 2 12,342 words 80 (8 x 10) minutes Synched with parallel audio recording and forced alignment van Schijndel Frequencies that matter September 7, 2015 44 / 69

audiobook meg corpus Collected 2 years ago at CMU 3 subjects Heart of Darkness, ch 2 12,342 words 80 (8 x 10) minutes Synched with parallel audio recording and forced alignment 306-channel Elekta Neuromag, CMU Movement/noise correction: SSP, SSS, tSSS Band-pass filtered 0.01–50 Hz Downsampled to 125 Hz Visually scanned for muscle artifacts; none found van Schijndel Frequencies that matter September 7, 2015 44 / 69

memory load via center embedding The cart [that the man bought] broke — d1: The cart … broke; d2: that the man bought van Schijndel Frequencies that matter September 7, 2015 45 / 69

memory load via center embedding The cart [that the man bought] broke — d1: The cart … broke; d2: that the man bought Depth annotations: van Schijndel et al. (2013) parser Nguyen et al. (2012) Generalized Categorial Grammar (GCG) van Schijndel Frequencies that matter September 7, 2015 45 / 69

data filtering Remove words: in short or long sentences (<4 or >50 words); that follow a word at another depth; that fail to parse Partition data: Dev set: One third of corpus Test set: Two thirds of corpus van Schijndel Frequencies that matter September 7, 2015 46 / 69

compute coherence Group by factor Compute coherence over subsets of 4 epochs van Schijndel Frequencies that matter September 7, 2015 47 / 69

dev coherence [Figure: coherence (d2 vs. d1) as a function of frequency (Hz) and time (s)] van Schijndel Frequencies that matter September 7, 2015 48 / 69

dev coherence +variance [Figure: coherence (d2 vs. d1) as a function of frequency (Hz) and time (s), with variance] van Schijndel Frequencies that matter September 7, 2015 49 / 69

possible confounds? Sentence position Unigram, Bigram, Trigram: COCA logprobs PCFG surprisal: parser output van Schijndel Frequencies that matter September 7, 2015 50 / 69

dev results Factor / p-value: Unigram 0.941; Bigram 0.257; Trigram 0.073; PCFG Surprisal 0.482; Sentence Position 0.031; Depth 0.005 — Depth 1 (40 items), Depth 2 (1118 items) van Schijndel Frequencies that matter September 7, 2015 51 / 69

test results Factor / p-value: Unigram 0.6480; Bigram 0.7762; Trigram 0.0264; PCFG Surprisal 0.3295; Sentence Position 0.4628; Depth 0.00002 — Depth 1 (86 items), Depth 2 (2142 items) van Schijndel Frequencies that matter September 7, 2015 52 / 69

test results Factor / p-value: Unigram 0.6480; Bigram 0.7762; Trigram 0.0264; PCFG Surprisal 0.3295; Sentence Position 0.4628; Depth 0.00002 Bonferroni correction removes trigrams, but … van Schijndel Frequencies that matter September 7, 2015 53 / 69

compute coherence: increased resolution Group by factor Compute coherence over subsets of 6 epochs van Schijndel Frequencies that matter September 7, 2015 54 / 69

test results: increased resolution Factor / p-value: Trigram 0.3817; Depth 0.0046 — Depth 1 (57 items), Depth 2 (1428 items) van Schijndel Frequencies that matter September 7, 2015 55 / 69

results: case study 3 Memory load is reflected in MEG connectivity Common confounds do not pose problems for oscillatory measures van Schijndel Frequencies that matter September 7, 2015 56 / 69

conclusions Cloze probabilities are insufficient as frequency control Hierarchic syntactic frequencies strongly influence processing Reading time studies need to use local and cumulative n-grams Oscillatory analyses could avoid control predictor explosion van Schijndel Frequencies that matter September 7, 2015 57 / 69

acknowledgements Stefan Frank, Matthew Traxler, Shari Speer, Roberto Zamparelli Attendees of CogSci 2014, CUNY 2015, NAACL 2015, CMCL 2015 OSU Linguistics Targeted Investment for Excellence (2012-2013) National Science Foundation (DGE-1343012) University of Pittsburgh Medical Center MEG Seed Fund National Institutes of Health CRCNS (5R01HD075328-02) van Schijndel Frequencies that matter September 7, 2015 58 / 69

thanks! questions? Cloze probabilities are insufficient as frequency control Hierarchic syntactic frequencies strongly influence processing Reading time studies need to use local and cumulative n-grams Oscillatory analyses could avoid control predictor explosion van Schijndel Frequencies that matter September 7, 2015 59 / 69

improved n-gram baseline [Figure: n-gram perplexity (y-axis 14–20) as a function of n-gram order, n = 1–6] van Schijndel Frequencies that matter September 7, 2015 60 / 69

further improved n-gram baseline [Figure: n-gram perplexity (y-axis 14–20) as a function of n-gram order, n = 1–6] van Schijndel Frequencies that matter September 7, 2015 61 / 69

further improved n-gram baseline [Figure: pruned vs. unpruned n-gram perplexity (y-axis 14–20) as a function of n-gram order, n = 1–6] van Schijndel Frequencies that matter September 7, 2015 62 / 69

model How probable is each subtree? Wall Street Journal (WSJ) section of the Penn Treebank: (a) Transitive: [VP-gNP [VP-gNP [VP-gNP [TV landed] t_i] [Adv carefully]] [PP [P behind] NP]] (b) Intransitive: [VP-gNP [VP [VP [IV landed]] [Adv carefully]] [PP-gNP [P behind] t_i]] van Schijndel Frequencies that matter September 7, 2015 63 / 69

model How probable is each subtree? Wall Street Journal (WSJ) section of the Penn Treebank: (a) Transitive: [VP-gNP [VP-gNP [VP-gNP [TV landed] t_i] [Adv carefully]] [PP [P behind] NP]] — 0.17 (b) Intransitive: [VP-gNP [VP [VP [IV landed]] [Adv carefully]] [PP-gNP [P behind] t_i]] — 0.01 van Schijndel Frequencies that matter September 7, 2015 63 / 69

model What is the probability of each interpretation? P(syntactic configuration) × P(generating the verb from that tree) P(Transitive) = P(VP-gNP → VP-gNP PP) · P(verb | TV) (1) P(Intransitive) = P(VP-gNP → VP PP-gNP) · P(verb | IV) (2) van Schijndel Frequencies that matter September 7, 2015 64 / 69

model What is the probability of each interpretation? P(syntactic configuration) × P(generating the verb from that tree), where the verb factor ∝ P(subcat bias) / P(preterminal prior) P(Transitive) = P(VP-gNP → VP-gNP PP) · P(verb | TV) (1) P(Intransitive) = P(VP-gNP → VP PP-gNP) · P(verb | IV) (2) van Schijndel Frequencies that matter September 7, 2015 65 / 69

model What is the probability of each interpretation? P(syntactic configuration) × P(subcat bias)/P(preterminal prior) P(Transitive) = P(VP-gNP → VP-gNP PP) · P(verb | TV) ∝ P(VP-gNP → VP-gNP PP) · P(TV | verb) / P(TV) (1) P(Intransitive) = P(VP-gNP → VP PP-gNP) · P(verb | IV) ∝ P(VP-gNP → VP PP-gNP) · P(IV | verb) / P(IV) (2) van Schijndel Frequencies that matter September 7, 2015 66 / 69

model What are the preterminal priors? Relative prior probability from the WSJ: P(TV) : P(IV) = 2.6 : 1 van Schijndel Frequencies that matter September 7, 2015 67 / 69

model What is the probability of each interpretation? P(syntactic configuration) × P(subcat bias)/P(preterminal prior) P(Transitive) ∝ P(VP-gNP → VP-gNP PP) · P(TV | verb) / P(TV) = 0.17 · P(TV | verb) / 2.6 (1) P(Intransitive) ∝ P(VP-gNP → VP PP-gNP) · P(IV | verb) / P(IV) = 0.01 · P(IV | verb) / 1.0 (2) van Schijndel Frequencies that matter September 7, 2015 68 / 69

model What is the probability of each interpretation? P(syntactic configuration) × P(subcat bias)/P(preterminal prior) P(Transitive) ∝ P(VP-gNP → VP-gNP PP) · P(TV | verb) / P(TV) = 0.17 · P(TV | verb) / 2.6 = 0.065 · P(TV | verb) (1) P(Intransitive) ∝ P(VP-gNP → VP PP-gNP) · P(IV | verb) / P(IV) = 0.01 · P(IV | verb) / 1.0 = 0.01 · P(IV | verb) (2) van Schijndel Frequencies that matter September 7, 2015 68 / 69

model What is the probability of each interpretation? P(syntactic configuration) × P(subcat bias)/P(preterminal prior) P(Transitive) ∝ P(VP-gNP → VP-gNP PP) · P(TV | verb) / P(TV) = 0.17 · P(TV | verb) / 2.6 = 0.065 · P(TV | verb) (1) P(Intransitive) ∝ P(VP-gNP → VP PP-gNP) · P(IV | verb) / P(IV) = 0.01 · P(IV | verb) / 1.0 = 0.01 · P(IV | verb) (2) Pickering & Traxler (2003) experimentally determined subcat biases for a wide variety of verbs van Schijndel Frequencies that matter September 7, 2015 68 / 69

evaluation Pickering & Traxler (2003) (1) That's the plane that the pilot landed carefully behind in the fog at the airport (2) That's the truck that the pilot landed carefully behind in the fog at the airport Using Pickering & Traxler's (2003) subcat bias data: van Schijndel Frequencies that matter September 7, 2015 69 / 69

evaluation Pickering & Traxler (2003) (1) That's the plane that the pilot landed carefully behind in the fog at the airport (2) That's the truck that the pilot landed carefully behind in the fog at the airport Using Pickering & Traxler's (2003) subcat bias data: P(Transitive | landed) ≈ 0.17 · 0.25 / 2.6 ≈ 0.016 P(Intransitive | landed) ≈ 0.01 · 0.40 / 1.0 = 0.004 van Schijndel Frequencies that matter September 7, 2015 69 / 69
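The arithmetic behind these two lines can be checked directly; the inputs are the figures given on the preceding slides (WSJ rule probabilities 0.17 and 0.01, Pickering & Traxler's cloze-based subcat biases 0.25 and 0.40, and the 2.6:1 TV:IV preterminal prior).

```python
# Check of the worked example above (all numbers from the preceding slides).
p_transitive = 0.17 * 0.25 / 2.6    # rule probability * NP-complement bias / P(TV) prior
p_intransitive = 0.01 * 0.40 / 1.0  # rule probability * PP-complement bias / P(IV) prior
print(round(p_transitive, 3), round(p_intransitive, 3))  # 0.016 0.004
print(p_transitive / p_intransitive)                     # ~4x, i.e. ~300% more likely
```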