Driving Semantic Parsing from the World's Response

Driving Semantic Parsing from the World's Response. James Clarke, Dan Goldwasser, Ming-Wei Chang, Dan Roth. Cognitive Computation Group, University of Illinois at Urbana-Champaign. CoNLL 2010.

What is Semantic Parsing? Meaning representation make(coffee, sugar=0, milk=0.3) for the sentence "I'd like a coffee with no sugar and just a little milk".

Supervised Learning Problem: (text, meaning) pairs → training algorithm → model. Challenges: this is a structured prediction problem; should part of the structure be modeled as hidden?

Lots of previous work. Multiple approaches to the problem: KRISP (Kate & Mooney 2006), an SVM-based parser using string kernels; Zettlemoyer & Collins 2005, 2007, a probabilistic parser based on relaxed CCG grammars; WASP (Wong & Mooney 2006, 2007), based on synchronous CFGs; Ge & Mooney 2009, an integrated syntactic and semantic parser. Assumption: a training set consisting of natural language and meaning representation pairs.

Using the World's Response. For "I'd like a coffee with no sugar and just a little milk" and the meaning representation make(coffee, sugar=0, milk=0.3), executing the representation in the world produces a response that is either Good! or Bad! Question: Can we use feedback based on the response to provide supervision?

This work. We aim to: reduce the burden of annotation for semantic parsing. We focus on: using the World's response to learn a semantic parser; developing new training algorithms to support this learning paradigm; a lightweight semantic parsing model that doesn't require annotated data. This results in: learning a semantic parser using zero annotated meaning representations.

Outline: 1. Semantic Parsing; 2. Learning (DIRECT Approach, AGGRESSIVE Approach); 3. Semantic Parsing Model; 4. Experiments.

Semantic Parsing. INPUT x: "What is the largest state that borders Texas?" HIDDEN y: an intermediate representation connecting x and z. OUTPUT z: largest(state(next_to(texas))). Response r: New Mexico. We learn a function F : X → Z with ẑ = F_w(x) = argmax_{y ∈ Y, z ∈ Z} w^T Φ(x, y, z). Model: the nature of the inference and feature functions. Learning strategy: how we obtain the weights w.
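To make the scoring concrete, here is a minimal Python sketch of this formulation. It assumes a caller-supplied feature function phi(x, y, z) and a candidates(x) generator enumerating (y, z) pairs; both names, and the exhaustive enumeration itself, are illustrative assumptions rather than the authors' implementation (the actual system uses a constrained inference procedure).

    import numpy as np

    def score(w, phi, x, y, z):
        # Linear score w^T Phi(x, y, z) for a candidate (intermediate y, meaning representation z).
        return float(np.dot(w, phi(x, y, z)))

    def predict(w, phi, candidates, x):
        # z_hat = argmax over (y, z) of w^T Phi(x, y, z); candidates(x) enumerates (y, z) pairs.
        return max(candidates(x), key=lambda yz: score(w, phi, x, yz[0], yz[1]))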

Learning. Inputs: natural language sentences; a feedback function Feedback : X × Z → {+1, −1}; zero meaning representations. Feedback(x, z) = +1 if execute(z) = r, and −1 otherwise. Goal: a weight vector w that scores the correct meaning representation higher than all other meaning representations. Response-driven learning loop: input text → predict meaning representation → apply to the World → feedback.
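The feedback function itself is simple to state in code. A minimal sketch, assuming a domain-specific execute(z) that runs a meaning representation against the World (for Geoquery, a database query); execute is a placeholder name, not an API from the paper.

    def feedback(x, z, r, execute):
        # +1 if executing the predicted meaning representation z yields the expected
        # response r; -1 otherwise. x is kept only to mirror the slide's signature
        # Feedback : X x Z -> {+1, -1}.
        return +1 if execute(z) == r else -1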

Learning Strategies. The outer loop:
  repeat
    for all input sentences do
      Solve the inference problem: y, z = argmax w^T Φ(x, y, z)
      Query the Feedback function
    end for
    Learn a new w using the feedback
  until convergence
Each sentence x_i is paired with its predicted (y_i, z_i) and a feedback label of +1 or −1.
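A sketch of this outer loop in Python, shared by both strategies. It reuses the earlier sketches with slightly simplified signatures (predict(w, x) and feedback(x, z, r) are assumed to be wrappers with phi, candidates, and execute closed over), and learn stands in for either the DIRECT or the AGGRESSIVE update; all names are illustrative.

    import numpy as np

    def response_driven_learning(w, data, predict, feedback, learn, max_iters=10):
        # data is a list of (sentence x, expected response r) pairs; no meaning representations.
        for _ in range(max_iters):
            examples = []
            for x, r in data:
                y, z = predict(w, x)            # solve the inference problem
                f = feedback(x, z, r)           # query the feedback function (+1 / -1)
                examples.append((x, y, z, f))
            w_new = learn(examples, w)          # learn a new w using the feedback
            if np.allclose(w_new, w):           # crude convergence check
                break
            w = w_new
        return w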

DIRECT Approach: binary learning within the loop (input text → predict meaning representation → apply to World → feedback). DIRECT learns a binary classifier to discriminate between good and bad meaning representations.

DIRECT Approach. Use each (x, y, z) as a training example with the label f given by the feedback. Find w such that f · w^T Φ(x, y, z) > 0.
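A minimal sketch of one DIRECT learning step, using scikit-learn's LinearSVC as a stand-in binary learner (the slides do not prescribe this particular learner); phi and the (x, y, z, f) example format follow the earlier sketches.

    import numpy as np
    from sklearn.svm import LinearSVC

    def direct_learn(examples, phi):
        # Each example is (x, y, z, f) with f in {+1, -1} from the feedback function.
        # Train a binary classifier so that f * w^T Phi(x, y, z) > 0 on the training data.
        X = np.array([phi(x, y, z) for (x, y, z, f) in examples])
        labels = np.array([f for (x, y, z, f) in examples])
        clf = LinearSVC().fit(X, labels)   # requires both positive and negative feedback present
        return clf.coef_.ravel()           # new weight vector w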

DIRECT Approach. Each point is represented by Φ(x, y, z), normalized with respect to x; learn a binary classifier (weight vector w) that discriminates between good and bad meaning representations.

DIRECT Approach, one iteration: for every input sentence x_i, solve the inference problem y_i, z_i = argmax w^T Φ(x_i, y, z), query the feedback function to label (x_i, y_i, z_i) as +1 or −1, then learn a new w from these labeled examples.

DIRECT Approach. [Figure: the separating weight vector w is re-learned from the newly labeled points at each iteration.] Repeat until convergence!

AGGRESSIVE Approach: structured learning within the same loop (input text → predict meaning representation → apply to World → feedback). AGGRESSIVE: positive feedback is a good indicator of the correct meaning representation, so use data with positive feedback as training data for structured learning.

AGGRESSIVE Approach. Use only the items with positive feedback as training data for a structured learner, implicitly considering all other meaning representations for these examples as bad. Find w such that w^T Φ(x, y*, z*) > w^T Φ(x, y, z) for every competing (y, z), where (y*, z*) is the prediction that received positive feedback.
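A sketch of one AGGRESSIVE learning step. A structured perceptron is used here as a stand-in for the structured learner (the slides do not name a specific one); candidates and phi are the same illustrative assumptions as before.

    import numpy as np

    def aggressive_learn(examples, w, phi, candidates, epochs=5, lr=1.0):
        # Keep only examples with positive feedback and treat their predicted (y, z) as gold,
        # i.e. push toward w^T Phi(x, y*, z*) > w^T Phi(x, y, z) for every competing (y, z).
        positives = [(x, y, z) for (x, y, z, f) in examples if f > 0]
        for _ in range(epochs):
            for x, y_star, z_star in positives:
                # current best competitor under the current weights
                y_hat, z_hat = max(candidates(x),
                                   key=lambda yz: np.dot(w, phi(x, yz[0], yz[1])))
                if (y_hat, z_hat) != (y_star, z_star):
                    w = w + lr * (np.asarray(phi(x, y_star, z_star))
                                  - np.asarray(phi(x, y_hat, z_hat)))
        return w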

AGGRESSIVE Approach, one iteration: for every input sentence x_i, solve the inference problem y_i, z_i = argmax w^T Φ(x_i, y, z), query the feedback function, keep only the examples that received +1, and learn a new w from them with the structured learner; repeat until convergence.

Summary of Learning Strategies (both operate in the loop: input text → predict meaning representation → apply to World → feedback). DIRECT: uses both positive and negative feedback as examples to train a binary classifier. AGGRESSIVE: adapts the feedback signal and uses only positive feedback to train a structured predictor.

Model. INPUT x: "What is the largest state that borders Texas?" HIDDEN y, OUTPUT z: largest(state(next_to(texas))). ẑ = F_w(x) = argmax_{y ∈ Y, z ∈ Z} w^T Φ(x, y, z). First-order decisions: map lexical items, e.g. the word "largest" → the predicate largest. Second-order decisions: composition, e.g. next_to(state(·)) or state(next_to(·)). The inference procedure leverages the typing information of the domain.

First-order Decisions. Example: "How many people live in the state of Texas?" Goal: population(state(texas)). Candidate predicates and constants: loc, texas, next_to, state, population, null. Use a simple lexicon to bootstrap the process: texas → texas; state → state; population → population; in → loc; next / borders / adjacent → next_to. Lexical resources help us move beyond the lexicon, e.g. wordnet_sim(people, population). Context helps disambiguate between choices.
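As a concrete illustration of this first-order step, here is a toy seed lexicon and a WordNet similarity lookup via NLTK. The lexicon entries and the use of path similarity are illustrative assumptions; the slide only states that a simple lexicon and wordnet_sim are used.

    from nltk.corpus import wordnet as wn

    # Toy seed lexicon bootstrapping first-order decisions: word -> domain predicate/constant.
    SEED_LEXICON = {
        "texas": "texas",
        "state": "state",
        "population": "population",
        "in": "loc",
        "next": "next_to", "borders": "next_to", "adjacent": "next_to",
    }

    def wordnet_sim(word, predicate_name):
        # Best WordNet path similarity between a sentence word and the word naming a predicate;
        # lets the model map words the seed lexicon misses, e.g. wordnet_sim("people", "population").
        best = 0.0
        for s1 in wn.synsets(word):
            for s2 in wn.synsets(predicate_name):
                best = max(best, s1.path_similarity(s2) or 0.0)
        return best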

Second-order Decisions. How do we compose the predicates and constants? Domain dependent: encode the typing information inherent in the domain into the inference procedure, e.g. population(state(·)) vs. state(population(·)). Features: dependency path distance, word position distance, predicate bigrams, e.g. for choosing between next_to(state(·)) and state(next_to(·)).
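A sketch of how the second-order (composition) features listed above might be computed for a candidate predicate-argument pair. The feature names and the token-position arguments are illustrative assumptions; dependency path distance would additionally require a parse, so it is only noted in a comment.

    def second_order_features(pred_outer, pos_outer, pred_inner, pos_inner):
        # Features for deciding whether pred_outer should take pred_inner as its argument,
        # e.g. next_to(state(.)) vs. state(next_to(.)).
        return {
            "predicate_bigram=%s|%s" % (pred_outer, pred_inner): 1.0,
            # distance (in words) between the tokens that triggered the two predicates
            "word_position_distance": abs(pos_outer - pos_inner),
            # dependency path distance between the trigger tokens would be added here,
            # given a dependency parse of the sentence
        }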

Evaluation. Domain: GEOQUERY, U.S. geographical questions. Response 250 (R250): 250 (x, r) pairs, with zero meaning representations. Query 250 (Q250): 250 sentences x. Evaluation metric: accuracy, the percentage of meaning representations that return the correct answer.

Learning Behavior.
  Algorithm    R250   Q250
  NOLEARN      22.2   -
  DIRECT       75.2   69.2
  AGGRESSIVE   82.4   73.2
  SUPERVISED   87.6   80.4
NOLEARN is used to initialize both learning approaches. Q: How good is our model when trained in a fully supervised manner? A: 80% on test data; other supervised methods range from 60% to 85% accuracy. Q: Is it possible to learn without any meaning representations? A: Yes: the learners cover more of the Response data set, and are only 7% below the SUPERVISED upper bound.

Learning Behavior. [Figure: accuracy on Response 250 over learning iterations 0-6, starting from the initialization, for AGGRESSIVE and DIRECT.] AGGRESSIVE correctly interprets 16% of the examples that DIRECT does not, and 9% vice-versa, leaving only 9% that neither approach interprets correctly.

Learning from Indirect Supervision. This is similar to other indirect learning protocols: (1) learning a binary classifier with hidden explanation, where supervision is only required for binary data and no labeled structures are needed (NAACL 2010; Chang, Goldwasser, Roth, Srikumar 2010a); (2) structured learning with binary and structured labels, a mix of supervision over binary and structured data where the binary label indicates whether the input has a good structure (ICML 2010; Chang, Goldwasser, Roth, Srikumar 2010b).

Conclusions. Contributions: response-driven learning, a new learning paradigm that doesn't rely on annotated meaning representations but is instead supervised at the response level, a natural supervision signal; two learning algorithms capable of working within response-driven learning; a shallow semantic parsing model. Future work: Can we combine the two learning algorithms? Other semantic parsing domains? Response-driven learning for other tasks?