Towards Efficient String Processing of Annotated Events

David Woods (1), Tim Fernando (2), Carl Vogel (2)
(1) ADAPT Centre, Trinity College Dublin, Ireland
(2) Computational Linguistics Group, Trinity Centre for Computing and Language Studies, School of Computer Science, Trinity College Dublin, Ireland
ISA-13, 2017

Motivation: ISO-TimeML Fragment [figure]

Motivation: TLINKs in an ISO-TimeML Document [figure]

Motivation: TLINK Examples

1. <TLINK relType="IS_INCLUDED" eventInstanceID="ei1" relatedToTime="t1"/>
2. <TLINK relType="IS_INCLUDED" timeID="t1" relatedToEventInstance="ei9"/>
3. <TLINK relType="BEFORE" eventInstanceID="ei9" relatedToEventInstance="ei10"/>

Motivation: Allen Relations [figure: Allen (1983, p. 835, Fig. 2)]

Motivation: Allen Relations Example

Example: John slept through the fire alarm last Tuesday.

This sentence gives us two events and one time period:
1. js = John slept (event)
2. fa = a fire alarm occurred (event)
3. lt = last Tuesday (time period)

We can represent the information with the binary Allen Relations:
js di fa
js d lt

Introduction: Strings as Models

We can use strings as models to represent this event data effectively.

Example: John slept through the fire alarm last Tuesday.

[lt][lt, js][lt, js, fa][lt, js][lt]

Each bracketed group is one symbol of the string: the set of fluents that hold at that moment.

Introduction: Sets as Symbols

Fix a finite set A of fluents. Each fluent is understood as naming an event instance (or time) in ISO-TimeML. We encode finite sets of these fluents as symbols, which may appear in a string.

Introduction: Event-Strings

A string $s = \alpha_1 \cdots \alpha_n$ of subsets $\alpha_i$ of $A$ can be construed as a finite model consisting of $n$ moments of time $i \in \{1, \ldots, n\}$.
Each $\alpha_i$ specifies all fluents in $A$ that hold simultaneously at $i$.
Each $\alpha_i$ is understood to occur chronologically before $\alpha_j$ if and only if $i < j$.
The powerset $2^A$ of $A$ serves as the alphabet $\Sigma = 2^A$ of an event-string $s \in \Sigma^+$.
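As a minimal sketch of this definition, an event-string can be held in Python as a tuple of frozensets. The helper names `box` and `moments` are illustrative assumptions, not part of the slides.

```python
# Hypothetical representation: an event-string is a tuple of frozensets,
# one frozenset of fluent names per moment of time.
def box(*fluents):
    """Build one symbol: the set of fluents holding at the same moment."""
    return frozenset(fluents)

# "John slept through the fire alarm last Tuesday."
john = (box("lt"), box("lt", "js"), box("lt", "js", "fa"),
        box("lt", "js"), box("lt"))

def moments(s, fluent):
    """Return the 1-based moments i at which `fluent` holds in s."""
    return [i for i, alpha in enumerate(s, start=1) if fluent in alpha]

print(moments(john, "js"))  # [2, 3, 4]
print(moments(john, "fa"))  # [3]
```

Tuples and frozensets are hashable, so whole event-strings can later be collected into Python sets.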

Introduction: No Time Without Change

"But neither does time exist without change" (Aristotle, Physics IV)

The precise real-time duration of each symbol is disregarded (for now).
Event-strings model a kind of inertial world.
Change is the only marker of progression from one moment to the next.

Superposition and Block Compression: Superposition

In order to usefully collect information from multiple strings into a single string, we define the operation of superposition.

Definition: With two strings $s$ and $s'$ of equal length, their superposition, $s \mathbin{\&} s'$, is their componentwise union:

$\alpha_1 \cdots \alpha_n \mathbin{\&} \alpha'_1 \cdots \alpha'_n := (\alpha_1 \cup \alpha'_1) \cdots (\alpha_n \cup \alpha'_n)$

Superposition and Block Compression: Box Notation

For convenience of notation, we draw boxes rather than curly braces { } to represent sets of fluents in an event-string. Here each box is written with square brackets: [a, b] is the symbol {a, b}, and [] is the empty box.

Example: With a, b, c, d ∈ A:

[a][c] & [b][d] = [a, b][c, d]
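Superposition can be sketched as a componentwise union over two equal-length tuples. The tuple-of-frozensets encoding and the names `box` and `superpose` are assumptions made for illustration.

```python
def box(*fluents):
    """One symbol of an event-string (hypothetical helper)."""
    return frozenset(fluents)

def superpose(s, t):
    """Componentwise union of two event-strings of equal length."""
    if len(s) != len(t):
        raise ValueError("superposition needs strings of equal length")
    return tuple(a | b for a, b in zip(s, t))

# The box-notation example: [a][c] & [b][d] = [a, b][c, d]
result = superpose((box("a"), box("c")), (box("b"), box("d")))
print(result == (box("a", "b"), box("c", "d")))  # True
```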

Superposition and Block Compression: Stutter

We can cause a string $s = \alpha_1 \cdots \alpha_n$ to stutter such that $\alpha_i = \alpha_{i+1}$ for some integer $0 < i < n$.
For example, [a][a][a][c][c] is a stuttering version of [a][c].
Since the real-time duration of each box is not taken into account, the interpretation of the string is unaffected.

Superposition and Block Compression: Block Compression

We can transform a stuttering string to a stutterless string through block compression.

Definition:

$$bc(s) := \begin{cases} s & \text{if } \mathrm{length}(s) \le 1 \\ bc(\alpha s') & \text{if } s = \alpha\alpha s' \\ \alpha\, bc(\alpha' s') & \text{if } s = \alpha\alpha' s' \text{ with } \alpha \ne \alpha' \end{cases}$$

Thus, bc([a][a][a][c][c]) = [a][c].
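The three cases of the definition translate almost line for line into a recursive function. This is a sketch over the assumed tuple-of-frozensets encoding. An iterative loop would avoid Python's recursion limit on very long strings.

```python
def box(*fluents):
    """One symbol of an event-string (hypothetical helper)."""
    return frozenset(fluents)

def bc(s):
    """Block compression, following the three cases of the definition."""
    if len(s) <= 1:               # length(s) <= 1
        return tuple(s)
    if s[0] == s[1]:              # s = alpha alpha s'
        return bc(s[1:])
    return (s[0],) + bc(s[1:])    # s = alpha alpha' s', alpha != alpha'

a, c = box("a"), box("c")
print(bc((a, a, a, c, c)) == (a, c))  # True
```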

Superposition and Block Compression: Inverse Block Compression

We can generate infinitely many stuttering strings, all of which are bc-equivalent.

Example: $bc^{-1}$([a][c]) = { [a][c], [a][a][c], [a][c][c], ... } = [a]$^+$[c]$^+$

Precisely, a string $s'$ is bc-equivalent to a string $s$ iff $s' \in bc^{-1}bc(s)$, and $s' \in bc^{-1}bc(s)$ iff $bc(s) = bc(s')$.

Superposition and Block Compression: Asynchronous Superposition

This gives our initial definition of asynchronous superposition.

Definition (Initial): The asynchronous superposition of two strings $s$ and $s'$ is the set of strings obtained by block compressing the results of superposing the strings which are bc-equivalent to $s$ and $s'$:

$s \mathbin{\&_{*}} s' := \{ bc(s'') \mid s'' \in bc^{-1}bc(s) \mathbin{\&} bc^{-1}bc(s') \}$

Here the superposition $L \mathbin{\&} L'$ of two languages collects $s_1 \mathbin{\&} s_2$ for every equal-length pair $s_1 \in L$, $s_2 \in L'$.

Example: [a][c] &* [b][d] = { [a, b][c, d], [a, b][a, d][c, d], [a, b][b, c][c, d] }

Superposition and Block Compression: Upper Bound on Asynchronous Superposition

We can improve this definition. It can be shown that for two strings of lengths $n$ and $n'$, the longest stutterless string produced by asynchronous superposition has length $n + n' - 1$.

Thus, for any integer $k > 0$ and string $s$, we introduce a new operation $\mathrm{pad}_k(s)$ which generates the set of strings of length $k$ which are bc-equivalent to $s$.

Definition: $\mathrm{pad}_k(s) = bc^{-1}(s) \cap \Sigma^k$
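One way to enumerate the padded strings is to choose where each block of the compressed string ends. The sketch below follows the prose reading (all length-k strings that are bc-equivalent to s). The function names and the cut-point enumeration are assumptions, not the authors' implementation.

```python
from itertools import combinations

def box(*fluents):
    """One symbol of an event-string (hypothetical helper)."""
    return frozenset(fluents)

def bc(s):
    """Block compression: collapse runs of identical adjacent symbols."""
    out = []
    for symbol in s:
        if not out or out[-1] != symbol:
            out.append(symbol)
    return tuple(out)

def pad(s, k):
    """All strings of length k that are bc-equivalent to s."""
    core = bc(s)
    m = len(core)
    if m == 0 or k < m:
        return set()
    result = set()
    # Pick m - 1 cut points in 1..k-1, so every block has length >= 1.
    for cuts in combinations(range(1, k), m - 1):
        bounds = (0,) + cuts + (k,)
        padded = []
        for symbol, lo, hi in zip(core, bounds, bounds[1:]):
            padded.extend([symbol] * (hi - lo))
        result.add(tuple(padded))
    return result

a, c = box("a"), box("c")
print(pad((a, c), 3) == {(a, a, c), (a, c, c)})  # True
```

For a stutterless string of length m, this produces one string per way of splitting k positions into m non-empty blocks.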

Superposition and Block Compression: Upper Bound on Asynchronous Superposition (continued)

An improved definition of asynchronous superposition puts a clear finite bound on the infinite language generated by inverse block compression.

Definition (Improved): For any $s, s' \in \Sigma^+$ with nonzero lengths $n$ and $n'$ respectively,

$s \mathbin{\&_{*}} s' = \{ bc(s'') \mid s'' \in \mathrm{pad}_{n+n'-1}(s) \mathbin{\&} \mathrm{pad}_{n+n'-1}(s') \}$
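The improved definition is finite, so it can be computed directly. The following sketch assumes the tuple-of-frozensets encoding, and all function names are illustrative. It pads both strings to length n + n' - 1, superposes every pair, and block compresses the results.

```python
from itertools import combinations

def box(*fluents):
    return frozenset(fluents)

def bc(s):
    """Block compression: collapse runs of identical adjacent symbols."""
    out = []
    for symbol in s:
        if not out or out[-1] != symbol:
            out.append(symbol)
    return tuple(out)

def pad(s, k):
    """All strings of length k that are bc-equivalent to s."""
    core = bc(s)
    m = len(core)
    if m == 0 or k < m:
        return set()
    result = set()
    for cuts in combinations(range(1, k), m - 1):
        bounds = (0,) + cuts + (k,)
        padded = []
        for symbol, lo, hi in zip(core, bounds, bounds[1:]):
            padded.extend([symbol] * (hi - lo))
        result.add(tuple(padded))
    return result

def async_superpose(s, t):
    """Asynchronous superposition via padding to length n + n' - 1."""
    k = len(s) + len(t) - 1
    return {bc(tuple(x | y for x, y in zip(u, v)))
            for u in pad(s, k) for v in pad(t, k)}

a, b, c, d = box("a"), box("b"), box("c"), box("d")
ab, ad, bc_, cd = box("a", "b"), box("a", "d"), box("b", "c"), box("c", "d")
out = async_superpose((a, c), (b, d))
print(out == {(ab, cd), (ab, ad, cd), (ab, bc_, cd)})  # True
```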

Event Representation: Bounding Boxes

We use the empty box [] as a string of length 1 (not to be confused with the empty string $\epsilon$, which has length 0) to bound events, allowing us to represent the fact that they are finite.

Asynchronous superposition (&*) allows us to generate the 13 strings in [][e][] &* [][e'][], each of which corresponds to one of the unique Allen Relations, and also to one of the relation types in ISO-TimeML's TLINKs.

Event Representation: Allen Relations as Event-Strings

Relation    Event-string            Name
e = e'      [][e, e'][]             equal
e s e'      [][e, e'][e'][]         starts
e si e'     [][e, e'][e][]          starts (inverse)
e f e'      [][e'][e, e'][]         finishes
e fi e'     [][e][e, e'][]          finishes (inverse)
e d e'      [][e'][e, e'][e'][]     during
e di e'     [][e][e, e'][e][]       during (inverse)
e o e'      [][e][e, e'][e'][]      overlaps
e oi e'     [][e'][e, e'][e][]      overlaps (inverse)
e m e'      [][e][e'][]             meets
e mi e'     [][e'][e][]             meets (inverse)
e < e'      [][e][][e'][]           before
e > e'      [][e'][][e][]           after
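The correspondence between the 13 relations and the strings in [][e][] &* [][e'][] can be checked mechanically. The sketch below is an illustration under the tuple-of-frozensets encoding. `allen_strings` is a hypothetical helper that spells out the table, and `f` stands in for e'.

```python
from itertools import combinations

def box(*fluents):
    return frozenset(fluents)

def bc(s):
    out = []
    for symbol in s:
        if not out or out[-1] != symbol:
            out.append(symbol)
    return tuple(out)

def pad(s, k):
    core = bc(s)
    m = len(core)
    if m == 0 or k < m:
        return set()
    result = set()
    for cuts in combinations(range(1, k), m - 1):
        bounds = (0,) + cuts + (k,)
        padded = []
        for symbol, lo, hi in zip(core, bounds, bounds[1:]):
            padded.extend([symbol] * (hi - lo))
        result.add(tuple(padded))
    return result

def async_superpose(s, t):
    k = len(s) + len(t) - 1
    return {bc(tuple(x | y for x, y in zip(u, v)))
            for u in pad(s, k) for v in pad(t, k)}

def allen_strings(e, f):
    """The 13 bounded event-strings of the table, keyed by relation symbol."""
    O, E, F, EF = box(), box(e), box(f), box(e, f)
    return {
        "=": (O, EF, O),
        "s": (O, EF, F, O), "si": (O, EF, E, O),
        "f": (O, F, EF, O), "fi": (O, E, EF, O),
        "d": (O, F, EF, F, O), "di": (O, E, EF, E, O),
        "o": (O, E, EF, F, O), "oi": (O, F, EF, E, O),
        "m": (O, E, F, O), "mi": (O, F, E, O),
        "<": (O, E, O, F, O), ">": (O, F, O, E, O),
    }

generated = async_superpose((box(), box("e"), box()), (box(), box("f"), box()))
table = allen_strings("e", "f")
print(len(generated))                     # 13
print(generated == set(table.values()))   # True
```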

Event Representation: Three Unconstrained Bounded Events

[][e][] &* [][e'][] &* [][e''][] = { ... } (a long enumeration of event-strings, which the slide itself truncates with an ellipsis)

Constraints on Event-Strings: Constraints

How can we prevent unnecessary over-generation?

Constraints on Event-Strings: Reduct

The reduct operation will help to identify well-formed event-strings.

Definition: The reduct $\rho_X(s)$ for any $X \subseteq A$ and event-string $s$ produces a componentwise intersection of $s$ with $X$:

$\rho_X(\alpha_1 \cdots \alpha_n) := (\alpha_1 \cap X) \cdots (\alpha_n \cap X)$

Example: With a, b ∈ A: bc(ρ_{a}([a][a, b][b])) = [a][]
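The reduct is a componentwise intersection, so a sketch needs only one comprehension. The names `reduct` and `bc` and the tuple-of-frozensets encoding are assumptions made for illustration.

```python
def box(*fluents):
    return frozenset(fluents)

def bc(s):
    """Block compression: collapse runs of identical adjacent symbols."""
    out = []
    for symbol in s:
        if not out or out[-1] != symbol:
            out.append(symbol)
    return tuple(out)

def reduct(s, X):
    """Componentwise intersection of event-string s with the fluent set X."""
    X = frozenset(X)
    return tuple(alpha & X for alpha in s)

s = (box("a"), box("a", "b"), box("b"))
print(reduct(s, {"a"}) == (box("a"), box("a"), box()))  # True
print(bc(reduct(s, {"a"})) == (box("a"), box()))        # True
```

The last symbol of `s` contains only b, so its intersection with {a} is the empty box, which survives block compression.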

Constraints on Event-Strings: Well-formed Event-Strings

Fluents are interval-like. Thus for any event-string $s$ and any $e \in A$, bc(ρ_{e}(s)) = [][e][] (or [], if e doesn't appear in s).

Relations are consistent. For example, if the relations e > e' and e' > e'' hold, then the relation e'' > e cannot also hold.

We may discard any event-string which is not well-formed.
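The interval-like condition can be tested fluent by fluent. The sketch below assumes bounded event-strings (beginning and ending with the empty box) and covers only this first condition, not the consistency of relations. `is_interval_like` is a hypothetical name.

```python
def box(*fluents):
    return frozenset(fluents)

def bc(s):
    out = []
    for symbol in s:
        if not out or out[-1] != symbol:
            out.append(symbol)
    return tuple(out)

def reduct(s, X):
    X = frozenset(X)
    return tuple(alpha & X for alpha in s)

def is_interval_like(s, fluents):
    """True if every fluent projects to [][e][], or to [] when absent."""
    for e in fluents:
        projected = bc(reduct(s, {e}))
        if projected not in ((box(), box(e), box()), (box(),)):
            return False
    return True

good = (box(), box("a"), box("a", "b"), box("b"), box())
bad = (box(), box("a"), box(), box("a"), box())  # a holds twice, with a gap
print(is_interval_like(good, {"a", "b", "c"}))  # True
print(is_interval_like(bad, {"a"}))             # False
```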

Constraints on Event-Strings: Constrained Superposition

When a fluent appears in two different strings, $s$ and $s'$, which are to be asynchronously superposed, the number of well-formed results is usually reduced.

Example: The fluent b appears in both strings, yielding only one well-formed result:

[][a][][b][] &* [][b][][c][] = [][a][][b][][c][]

Without the constraint of being well-formed, the above example would generate 270 strings, rather than 1.
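One possible reading of the well-formedness filter is sketched below, under a stated assumption: a result is kept only if projecting it back onto the vocabulary of each input (reduct, then block compression) returns that input. This projection test is an interpretation of "relations are consistent", not a definition taken from the slides, and `constrained_superpose` is a hypothetical name.

```python
from itertools import combinations

def box(*fluents):
    return frozenset(fluents)

def bc(s):
    out = []
    for symbol in s:
        if not out or out[-1] != symbol:
            out.append(symbol)
    return tuple(out)

def pad(s, k):
    core = bc(s)
    m = len(core)
    if m == 0 or k < m:
        return set()
    result = set()
    for cuts in combinations(range(1, k), m - 1):
        bounds = (0,) + cuts + (k,)
        padded = []
        for symbol, lo, hi in zip(core, bounds, bounds[1:]):
            padded.extend([symbol] * (hi - lo))
        result.add(tuple(padded))
    return result

def reduct(s, X):
    return tuple(alpha & X for alpha in s)

def vocabulary(s):
    """All fluents mentioned anywhere in s."""
    return frozenset().union(*s)

def constrained_superpose(s, t):
    """Asynchronous superposition, keeping only results that project
    back onto both inputs (assumed well-formedness test)."""
    k = len(s) + len(t) - 1
    vs, vt = vocabulary(s), vocabulary(t)
    kept = set()
    for u in pad(s, k):
        for v in pad(t, k):
            r = bc(tuple(x | y for x, y in zip(u, v)))
            if bc(reduct(r, vs)) == bc(s) and bc(reduct(r, vt)) == bc(t):
                kept.add(r)
    return kept

O, a, b, c = box(), box("a"), box("b"), box("c")
results = constrained_superpose((O, a, O, b, O), (O, b, O, c, O))
print(results == {(O, a, O, b, O, c, O)})  # True
```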

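Constraints on Event-Strings: Transitivity Table Fragment

Rows give the relation between a and b, columns the relation between b and c. Each cell lists the resulting well-formed event-strings.

Column strings:
  b before c: [][b][][c][]
  b during c: [][c][b, c][c][]
  b meets c:  [][b][c][]

Row a before b, [][a][][b][]:
  b before c: [][a][][b][][c][]
  b during c: [][a][][c][b, c][c][]
              [][a][a, c][c][b, c][c][]
              [][a][c][b, c][c][]
              [][c][a, c][c][b, c][c][]
              [][a, c][c][b, c][c][]
  b meets c:  [][a][][b][c][]

Row a during b, [][b][a, b][b][]:
  b before c: [][b][a, b][b][][c][]
  b during c: [][c][b, c][a, b, c][b, c][c][]
  b meets c:  [][b][a, b][b][c][]

Row a meets b, [][a][b][]:
  b before c: [][a][b][][c][]
  b during c: [][a][a, c][b, c][c][]
              [][c][a, c][b, c][c][]
              [][a, c][b, c][c][]
  b meets c:  [][a][b][c][]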

Constraints on Event-Strings: Arbitrary Events

No matter how many events feature in an event-string, applying the reduct ρ_{e, e'} and block compressing (where e and e' are the events we are interested in) will give the event-string which corresponds to the Allen Relation between e and e'.

For example, given a < b, b < c, and c < d we can deduce a < d:

1. bc(ρ_{a, d}([][a][][b][] &* [][b][][c][] &* [][c][][d][]))
2. bc(ρ_{a, d}([][a][][b][][c][][d][]))
3. bc([][a][][][][][][d][])
4. [][a][][d][]
5. a < d
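Steps 2 to 5 above amount to a projection followed by a table lookup. The sketch below assumes the combined string is already available and that events are bounded by empty boxes. `allen_relation` and `allen_strings` are hypothetical helper names.

```python
def box(*fluents):
    return frozenset(fluents)

def bc(s):
    out = []
    for symbol in s:
        if not out or out[-1] != symbol:
            out.append(symbol)
    return tuple(out)

def reduct(s, X):
    X = frozenset(X)
    return tuple(alpha & X for alpha in s)

def allen_strings(e, f):
    """The 13 bounded event-strings, keyed by relation symbol."""
    O, E, F, EF = box(), box(e), box(f), box(e, f)
    return {
        "=": (O, EF, O),
        "s": (O, EF, F, O), "si": (O, EF, E, O),
        "f": (O, F, EF, O), "fi": (O, E, EF, O),
        "d": (O, F, EF, F, O), "di": (O, E, EF, E, O),
        "o": (O, E, EF, F, O), "oi": (O, F, EF, E, O),
        "m": (O, E, F, O), "mi": (O, F, E, O),
        "<": (O, E, O, F, O), ">": (O, F, O, E, O),
    }

def allen_relation(s, x, y):
    """Relation symbol between x and y in s, or None if none matches."""
    projected = bc(reduct(s, {x, y}))
    for name, template in allen_strings(x, y).items():
        if projected == template:
            return name
    return None

O = box()
combined = (O, box("a"), O, box("b"), O, box("c"), O, box("d"), O)
print(allen_relation(combined, "a", "d"))  # <
print(allen_relation(combined, "d", "a"))  # >
```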

Applied to ISO-TimeML: Example TLINKs

1. <TLINK relType="IS_INCLUDED" eventInstanceID="ei1" relatedToTime="t1"/>
2. <TLINK relType="IS_INCLUDED" timeID="t1" relatedToEventInstance="ei9"/>
3. <TLINK relType="BEFORE" eventInstanceID="ei9" relatedToEventInstance="ei10"/>

Applied to ISO-TimeML: TLINKs as Allen Relations

1. ei1 d t1
2. t1 d ei9
3. ei9 < ei10

Applied to ISO-TimeML: TLINKs as Event-Strings

1. [][t1][ei1, t1][t1][]
2. [][ei9][t1, ei9][ei9][]
3. [][ei9][][ei10][]
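The translation from TLINK to Allen relation to event-string might be sketched with the standard library's XML parser. Several details are assumptions: the attribute spellings follow TimeML's camel-case convention, the wrapping `<doc>` element is invented so the fragment parses, and the mapping covers only the two relation types used in this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element around the three example TLINKs.
DOC = """<doc>
<TLINK relType="IS_INCLUDED" eventInstanceID="ei1" relatedToTime="t1"/>
<TLINK relType="IS_INCLUDED" timeID="t1" relatedToEventInstance="ei9"/>
<TLINK relType="BEFORE" eventInstanceID="ei9" relatedToEventInstance="ei10"/>
</doc>"""

# Only the two relation types that occur in the example.
REL_TO_ALLEN = {"IS_INCLUDED": "d", "BEFORE": "<"}

def tlink_to_relation(elem):
    """Return (source, Allen symbol, target) for one TLINK element."""
    source = elem.get("eventInstanceID") or elem.get("timeID")
    target = elem.get("relatedToEventInstance") or elem.get("relatedToTime")
    return source, REL_TO_ALLEN[elem.get("relType")], target

def relation_to_string(x, rel, y):
    """Bounded event-string for 'x rel y' (only 'd' and '<' handled)."""
    O, X, Y, XY = frozenset(), frozenset([x]), frozenset([y]), frozenset([x, y])
    if rel == "d":
        return (O, Y, XY, Y, O)
    if rel == "<":
        return (O, X, O, Y, O)
    raise ValueError(f"unsupported relation: {rel}")

relations = [tlink_to_relation(e) for e in ET.fromstring(DOC).iter("TLINK")]
strings = [relation_to_string(*r) for r in relations]
print(relations)
# [('ei1', 'd', 't1'), ('t1', 'd', 'ei9'), ('ei9', '<', 'ei10')]
```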

Applied to ISO-TimeML: Combining Information

[][t1][ei1, t1][t1][] &* [][ei9][t1, ei9][ei9][] &* [][ei9][][ei10][]
= [][ei9][t1, ei9][ei1, t1, ei9][t1, ei9][ei9][][ei10][]

Applied to ISO-TimeML: Extracting New Information

1. bc(ρ_{ei1, ei10}([][ei9][t1, ei9][ei1, t1, ei9][t1, ei9][ei9][][ei10][]))
2. bc([][][][ei1][][][][ei10][])
3. [][ei1][][ei10][]
4. ei1 < ei10
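The whole example can be run end to end as a rough sketch. It assumes the same projection-based well-formedness test as a stand-in for the slides' constraints (each result must project back onto both of its inputs), and it folds the three strings left to right. All function names are illustrative.

```python
from itertools import combinations

def box(*fluents):
    return frozenset(fluents)

def bc(s):
    out = []
    for symbol in s:
        if not out or out[-1] != symbol:
            out.append(symbol)
    return tuple(out)

def pad(s, k):
    core = bc(s)
    m = len(core)
    if m == 0 or k < m:
        return set()
    result = set()
    for cuts in combinations(range(1, k), m - 1):
        bounds = (0,) + cuts + (k,)
        padded = []
        for symbol, lo, hi in zip(core, bounds, bounds[1:]):
            padded.extend([symbol] * (hi - lo))
        result.add(tuple(padded))
    return result

def reduct(s, X):
    X = frozenset(X)
    return tuple(alpha & X for alpha in s)

def vocabulary(s):
    return frozenset().union(*s)

def constrained_superpose(s, t):
    """Keep only results that project back onto both inputs (assumption)."""
    k = len(s) + len(t) - 1
    vs, vt = vocabulary(s), vocabulary(t)
    kept = set()
    for u in pad(s, k):
        for v in pad(t, k):
            r = bc(tuple(x | y for x, y in zip(u, v)))
            if bc(reduct(r, vs)) == bc(s) and bc(reduct(r, vt)) == bc(t):
                kept.add(r)
    return kept

O = box()
s1 = (O, box("t1"), box("ei1", "t1"), box("t1"), O)    # ei1 d t1
s2 = (O, box("ei9"), box("t1", "ei9"), box("ei9"), O)  # t1 d ei9
s3 = (O, box("ei9"), O, box("ei10"), O)                # ei9 < ei10

results = {s1}
for nxt in (s2, s3):
    results = {r for cur in results for r in constrained_superpose(cur, nxt)}

combined = (O, box("ei9"), box("t1", "ei9"), box("ei1", "t1", "ei9"),
            box("t1", "ei9"), box("ei9"), O, box("ei10"), O)
print(results == {combined})  # True
projected = bc(reduct(combined, {"ei1", "ei10"}))
print(projected == (O, box("ei1"), O, box("ei10"), O))  # True, i.e. ei1 < ei10
```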

Further Work

Deciding when to use asynchronous superposition (too many generated strings may not be worth it).
Developing the framework to treat event types and include more information (durations, etc.).

Acknowledgements This research is supported by Science Foundation Ireland (SFI) through the CNGL Programme (Grant 12/CE/I2267) in the ADAPT Centre (https://www.adaptcentre.ie) at Trinity College Dublin. The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund. Thank you for listening!