Kernels. A kernel K is a function of two objects, for example, two sentence/tree pairs (x1, y1) and (x2, y2)


Kernels
A kernel $K$ is a function of two objects, for example, two sentence/tree pairs $(x_1, y_1)$ and $(x_2, y_2)$: $K((x_1, y_1), (x_2, y_2))$.
Intuition: $K((x_1, y_1), (x_2, y_2))$ is a measure of the similarity between $(x_1, y_1)$ and $(x_2, y_2)$.
Formally: $K((x_1, y_1), (x_2, y_2))$ is a kernel if it can be shown that there is some feature vector mapping $\Phi(x, y)$ such that for all $x_1, y_1, x_2, y_2$:
$K((x_1, y_1), (x_2, y_2)) = \Phi(x_1, y_1) \cdot \Phi(x_2, y_2)$

(Trivial) Example of a Kernel
Given an existing feature vector representation $\Phi$, define
$K((x_1, y_1), (x_2, y_2)) = \Phi(x_1, y_1) \cdot \Phi(x_2, y_2)$

More Interesting Kernel
Given an existing feature vector representation $\Phi$, define
$K((x_1, y_1), (x_2, y_2)) = (1 + \Phi(x_1, y_1) \cdot \Phi(x_2, y_2))^2$
This can be shown to be an inner product in a new space $\Phi'$, where $\Phi'$ contains all quadratic terms of $\Phi$.
More generally,
$K((x_1, y_1), (x_2, y_2)) = (1 + \Phi(x_1, y_1) \cdot \Phi(x_2, y_2))^p$
can be shown to be an inner product in a new space $\Phi'$, where $\Phi'$ contains all polynomial terms of $\Phi$ up to degree $p$.
Question: can we come up with specialized kernels for NLP structures?
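The claim that the squared kernel is an inner product in a space of quadratic terms can be checked numerically. The sketch below is only an illustration: the function names and the particular scaling of the explicit map (factors of $\sqrt{2}$ on the linear and cross terms) are assumptions chosen so that the two computations agree, not something the slides specify.

```python
import itertools
import math

def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def quadratic_features(phi):
    """Hypothetical explicit map whose inner product equals (1 + phi.phi')**2."""
    d = len(phi)
    feats = [1.0]                                  # constant term
    feats += [math.sqrt(2) * x for x in phi]       # linear terms
    feats += [x * x for x in phi]                  # squared terms
    feats += [math.sqrt(2) * phi[i] * phi[j]       # cross terms, i < j
              for i, j in itertools.combinations(range(d), 2)]
    return feats

def poly_kernel(u, v, p=2):
    """Polynomial kernel computed directly in the original space."""
    return (1 + dot(u, v)) ** p

u = [1.0, 0.0, 0.0, 3.0]
v = [2.0, 0.0, 1.0, 1.0]
k_direct = poly_kernel(u, v)
k_explicit = dot(quadratic_features(u), quadratic_features(v))
print(k_direct, k_explicit)
```

The direct computation touches only the 4 original coordinates, while the explicit map already has 15; that gap is the reason to work with the kernel rather than the expanded vector.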

NLP Structures
Trees, e.g.: (S (NP John) (VP saw (NP Mary)))
Tagged sequences, e.g., named entity tagging: Napoleon/S Bonaparte/C was/N exiled/N to/N Elba/S
S = Start entity, C = Continue entity, N = Not an entity

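Feature Vectors: $\Phi$
$\Phi$ defines the representation of a structure: $\Phi$ maps a structure to a feature vector in $\mathbb{R}^d$.
Example tree: (S (NP She) (VP announced (NP (NP a program) (VP to (VP promote (NP safety (PP in (NP (NP trucks) and (NP vans)))))))))
$\Phi$ maps this tree to $\langle 1, 0, 2, 0, 0, 15, 5 \rangle$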

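Features
A feature is a function on a structure, e.g., $h(x)$ = number of times a particular subtree is seen in $x$.
[Figure: the subtree counted by $h$, and two example trees $T_1$ and $T_2$]
$h(T_1) = 1$, $h(T_2) = 2$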

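Feature Vectors
A set of functions $h_1 \ldots h_d$ define a feature vector
$\Phi(x) = \langle h_1(x), h_2(x), \ldots, h_d(x) \rangle$
[Figure: example trees $T_1$ and $T_2$]
$\Phi(T_1) = \langle 1, 0, 0, 3 \rangle$, $\Phi(T_2) = \langle 2, 0, 1, 1 \rangle$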

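All Subtrees Representation [Bod, 1998]
Given: non-terminal symbols $\{A, B, \ldots\}$ and terminal symbols $\{a, b, c, \ldots\}$.
An infinite set of subtrees [Figure: example subtrees]
An infinite set of features, e.g., $h_3(x, y)$ = number of times a particular subtree is seen in $(x, y)$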

All Sub-fragments for Tagged Sequences
Given: state symbols $\{S, C, N\}$ and terminal symbols $\{a, b, c, \ldots\}$.
An infinite set of sub-fragments [Figure: example sub-fragments]
An infinite set of features, e.g., $h_3(x)$ = number of times a particular sub-fragment is seen in $x$

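Inner Products
$\Phi(x) = \langle h_1(x), h_2(x), \ldots, h_d(x) \rangle$
Inner product ("Kernel") between two structures $T_1$ and $T_2$:
$\Phi(T_1) \cdot \Phi(T_2) = \sum_{i=1}^{d} h_i(T_1) h_i(T_2)$
[Figure: example trees $T_1$ and $T_2$]
$\Phi(T_1) = \langle 1, 0, 0, 3 \rangle$, $\Phi(T_2) = \langle 2, 0, 1, 1 \rangle$
$\Phi(T_1) \cdot \Phi(T_2) = 1 \times 2 + 0 \times 0 + 0 \times 1 + 3 \times 1 = 5$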

All Subtrees Representation
Given: non-terminal symbols $\{A, B, \ldots\}$ and terminal symbols $\{a, b, c, \ldots\}$.
An infinite set of subtrees [Figure: example subtrees]
Step 1: Choose an (arbitrary) mapping from subtrees to integers.
$h_i(x)$ = number of times subtree $i$ is seen in $x$
$\Phi(x) = \langle h_1(x), h_2(x), h_3(x), \ldots \rangle$

All Subtrees Representation
$\Phi$ is now huge.
But the inner product $\Phi(T_1) \cdot \Phi(T_2)$ can be computed efficiently using dynamic programming.

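Computing the Inner Product
Define $N_1$ and $N_2$ to be the sets of nodes in $T_1$ and $T_2$ respectively.
Define $I_i(x) = 1$ if the $i$'th subtree is rooted at $x$, and $0$ otherwise.
It follows that $h_i(T_1) = \sum_{n_1 \in N_1} I_i(n_1)$ and $h_i(T_2) = \sum_{n_2 \in N_2} I_i(n_2)$.
$\Phi(T_1) \cdot \Phi(T_2) = \sum_i h_i(T_1) h_i(T_2)$
$= \sum_i \left( \sum_{n_1 \in N_1} I_i(n_1) \right) \left( \sum_{n_2 \in N_2} I_i(n_2) \right)$
$= \sum_{n_1 \in N_1} \sum_{n_2 \in N_2} \sum_i I_i(n_1) I_i(n_2)$
$= \sum_{n_1 \in N_1} \sum_{n_2 \in N_2} \Delta(n_1, n_2)$
where $\Delta(n_1, n_2) = \sum_i I_i(n_1) I_i(n_2)$ is the number of common subtrees at $n_1, n_2$.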

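An Example
[Figure: example trees $T_1$ and $T_2$]
$\Phi(T_1) \cdot \Phi(T_2)$ is the sum of $\Delta(n_1, n_2)$ over every pair of a node $n_1$ in $T_1$ and a node $n_2$ in $T_2$.
Most of these terms are 0. Some are non-zero, e.g. one pair of nodes in the figure has $\Delta = 4$.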

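Recursive Definition of $\Delta(n_1, n_2)$
If the productions at $n_1$ and $n_2$ are different, $\Delta(n_1, n_2) = 0$.
Else if $n_1, n_2$ are pre-terminals, $\Delta(n_1, n_2) = 1$.
Else $\Delta(n_1, n_2) = \prod_{j=1}^{nc(n_1)} \left( 1 + \Delta(ch(n_1, j), ch(n_2, j)) \right)$
$nc(n_1)$ is the number of children of node $n_1$; $ch(n_1, j)$ is the $j$'th child of $n_1$.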
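The three-case recursion for the common-subtree count (written `delta` below), summed over all node pairs, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the lecture's implementation: trees are encoded as nested tuples `(label, child, ...)` with words as plain strings, words are assumed to occur only under pre-terminal nodes, and the example trees and all function names are hypothetical. Memoisation via `lru_cache` plays the role of the dynamic-programming table.

```python
from functools import lru_cache

def production(node):
    """Rule at a node: its label followed by its children's labels (or words)."""
    return (node[0],) + tuple(c if isinstance(c, str) else c[0] for c in node[1:])

def is_preterminal(node):
    """A pre-terminal dominates words only."""
    return all(isinstance(c, str) for c in node[1:])

def nodes(tree):
    """All non-word nodes of the tree, root first."""
    found = [tree]
    for child in tree[1:]:
        if not isinstance(child, str):
            found.extend(nodes(child))
    return found

@lru_cache(maxsize=None)
def delta(n1, n2):
    """Number of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0
    if is_preterminal(n1):
        return 1
    result = 1
    for c1, c2 in zip(n1[1:], n2[1:]):
        result *= 1 + delta(c1, c2)
    return result

def tree_kernel(t1, t2):
    """Inner product under the all-subtrees representation."""
    return sum(delta(n1, n2) for n1 in nodes(t1) for n2 in nodes(t2))

# Hypothetical example trees (pre-terminals N and V added for the sketch).
t1 = ("S", ("NP", ("N", "John")), ("VP", ("V", "saw"), ("NP", ("N", "Mary"))))
t2 = ("S", ("NP", ("N", "Mary")), ("VP", ("V", "saw"), ("NP", ("N", "John"))))
k11 = tree_kernel(t1, t1)
k12 = tree_kernel(t1, t2)
print(k11, k12)
```

Each node pair is evaluated once, so the cost is proportional to the number of node pairs rather than to the (very large) number of distinct subtrees.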

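Illustration of the Recursion
[Figure: two nodes with the same production, each with two children]
How many subtrees do the two nodes $n_1$ and $n_2$ have in common? i.e., what is $\Delta(n_1, n_2)$?
For the first pair of children $\Delta = 4$, and for the second pair of children $\Delta = 1$, so
$\Delta(n_1, n_2) = (4 + 1) \times (1 + 1) = 10$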

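The Inner Product for Tagged Sequences
Define $N_1$ and $N_2$ to be the sets of states in $T_1$ and $T_2$ respectively.
By a similar argument,
$\Phi(T_1) \cdot \Phi(T_2) = \sum_{n_1 \in N_1} \sum_{n_2 \in N_2} \Delta(n_1, n_2)$
where $\Delta(n_1, n_2)$ is the number of common sub-fragments at $n_1, n_2$.
[Figure: example tagged sequences $T_1$ and $T_2$]
As for trees, $\Phi(T_1) \cdot \Phi(T_2)$ is a sum of $\Delta$ terms over all pairs of states; e.g., one pair of states in the figure has $\Delta = 4$.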

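The Recursive Definition for Tagged Sequences
Define $N(n)$ = state following $n$, $W(n)$ = word at state $n$.
Define $\pi[W(n_1), W(n_2)] = 1$ iff $W(n_1) = W(n_2)$.
Then if the labels at $n_1$ and $n_2$ are the same,
$\Delta(n_1, n_2) = (1 + \pi[W(n_1), W(n_2)]) \times (1 + \Delta(N(n_1), N(n_2)))$
[Figure: example tagged sequences $T_1$ and $T_2$]
e.g., for two states with the same label that both emit the word $a$ and whose following states have $\Delta = 4$:
$\Delta(n_1, n_2) = (1 + \pi[a, a]) \times (1 + 4) = (1 + 1) \times (1 + 4) = 10$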
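The tagged-sequence recursion can be filled in as a table from the end of each sequence backwards. The sketch below makes several assumptions that the slide leaves implicit and that are labelled in the comments: the common-fragment count is taken to be 0 when the two labels differ, it is taken to be 0 past the end of a sequence, and the word-match function (`pi` here) returns 0 for different words. The sequences and all names are hypothetical.

```python
def exact_match(w1, w2):
    """Word-match function: 1 iff the two words are identical (0 otherwise, assumed)."""
    return 1.0 if w1 == w2 else 0.0

def delta_table(seq1, seq2, pi=exact_match):
    """delta[i][j] = number of common sub-fragments starting at state i of seq1
    and state j of seq2. Each sequence is a list of (label, word) pairs."""
    n1, n2 = len(seq1), len(seq2)
    # Extra row and column of zeros: no fragments past the end (assumption).
    delta = [[0.0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(n1 - 1, -1, -1):
        for j in range(n2 - 1, -1, -1):
            (label1, word1), (label2, word2) = seq1[i], seq2[j]
            if label1 == label2:  # different labels leave delta at 0 (assumption)
                delta[i][j] = (1 + pi(word1, word2)) * (1 + delta[i + 1][j + 1])
    return delta

def tagging_kernel(seq1, seq2, pi=exact_match):
    """Sum of delta over all pairs of states."""
    table = delta_table(seq1, seq2, pi)
    return sum(table[i][j] for i in range(len(seq1)) for j in range(len(seq2)))

# Hypothetical sequences chosen so the first pair of states reproduces
# the (1 + 1) * (1 + 4) = 10 pattern.
seq1 = [("N", "a"), ("N", "b"), ("S", "c")]
seq2 = [("N", "a"), ("N", "b"), ("S", "d")]
table = delta_table(seq1, seq2)
k = tagging_kernel(seq1, seq2)
print(table[0][0], table[1][1], k)
```

Passing a different `pi` (for example one that gives partial credit for shared spelling features) changes the kernel without touching the recursion.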

Refinements of the Kernels
Include the log probability from the baseline model:
$\Phi(T_1)$ is the representation under the all sub-fragments kernel
$L(T_1)$ is the log probability under the baseline model
New representation $\Phi'$ where
$\Phi'(T_1) \cdot \Phi'(T_2) = \beta L(T_1) L(T_2) + \Phi(T_1) \cdot \Phi(T_2)$
(includes $L(T_1)$ as an additional component with weight $\sqrt{\beta}$)
Allows the perceptron to use the original ranking as a default.
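A short check of why an extra component with weight $\sqrt{\beta}$ yields the $\beta L(T_1) L(T_2)$ term. The explicit form of the extended vector below is an assumption consistent with the stated weight; the slide gives only the resulting inner product.

```latex
\begin{align*}
\Phi'(T) &= \big\langle \sqrt{\beta}\, L(T),\ h_1(T),\ h_2(T),\ \ldots \big\rangle \\
\Phi'(T_1) \cdot \Phi'(T_2)
  &= \sqrt{\beta}\, L(T_1) \cdot \sqrt{\beta}\, L(T_2) + \sum_i h_i(T_1)\, h_i(T_2) \\
  &= \beta\, L(T_1)\, L(T_2) + \Phi(T_1) \cdot \Phi(T_2)
\end{align*}
```

So the new kernel costs one extra multiplication on top of the sub-fragment kernel, and $\beta$ controls how strongly the baseline score is weighted against the fragment features.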

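Refinements of the Kernels
Downweighting larger sub-fragments:
$\sum_{i=1}^{d} \lambda^{SIZE_i} h_i(T_1) h_i(T_2)$
where $0 < \lambda \le 1$ and $SIZE_i$ is the number of states/rules in the $i$'th fragment.
Simple modification to the recursive definitions, e.g.,
$\Delta(n_1, n_2) = (1 + \pi[W(n_1), W(n_2)]) \times (1 + \lambda \Delta(N(n_1), N(n_2)))$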
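To illustrate downweighting on trees rather than tagged sequences, the sketch below multiplies the common-subtree count by a factor `lam` at every matched production, so a fragment built from `k` rules contributes `lam ** k`. This placement of the factor is an assumption made for the sketch (the slide only shows a tagging example), and the tree encoding, example tree and names are hypothetical. Setting `lam = 1.0` recovers the unweighted count.

```python
from functools import lru_cache

def production(node):
    """Rule at a node: its label followed by its children's labels (or words)."""
    return (node[0],) + tuple(c if isinstance(c, str) else c[0] for c in node[1:])

def is_preterminal(node):
    return all(isinstance(c, str) for c in node[1:])

def nodes(tree):
    """All non-word nodes of the tree, root first."""
    found = [tree]
    for child in tree[1:]:
        if not isinstance(child, str):
            found.extend(nodes(child))
    return found

@lru_cache(maxsize=None)
def weighted_delta(n1, n2, lam):
    """Common subtrees at n1, n2, each weighted by lam ** (number of rules)."""
    if production(n1) != production(n2):
        return 0.0
    if is_preterminal(n1):
        return lam                      # a single rule: weight lam
    result = lam                        # the rule at this node
    for c1, c2 in zip(n1[1:], n2[1:]):
        result *= 1 + weighted_delta(c1, c2, lam)
    return result

def weighted_tree_kernel(t1, t2, lam):
    return sum(weighted_delta(a, b, lam) for a in nodes(t1) for b in nodes(t2))

# Hypothetical example tree; words sit only under pre-terminals (assumption).
t = ("S", ("NP", ("N", "John")), ("VP", ("V", "saw"), ("NP", ("N", "Mary"))))
k_plain = weighted_tree_kernel(t, t, 1.0)
k_half = weighted_tree_kernel(t, t, 0.5)
print(k_plain, k_half)
```

With `lam` below 1 the kernel value is dominated by small fragments, which limits the influence of large fragments that match only when two structures are nearly identical.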

Refinement of the Tagging Kernel
$\Delta(n_1, n_2) = (1 + \pi[W(n_1), W(n_2)]) \times (1 + \Delta(N(n_1), N(n_2)))$
Sub-fragments sensitive to spelling features (e.g., capitalization).
Define $\pi[x, y] = 1$ if $x$ and $y$ are identical, $\pi[x, y] = 0.5$ if $x$ and $y$ share the same capitalization features.
Example sub-fragments: N/exiled N/to S/Elba and N/exiled N/to S/Cap
Sub-fragments now include capitalization features: N/No-cap N/to S/Cap and N/No-cap N/No-cap S/Cap

Experimental Results
Parsing Wall Street Journal, sentences of ≤ 100 words (2416 sentences):
MODEL   LR      LP      CBs    0 CBs   2 CBs
CO99    88.1%   88.3%   1.06   64.0%   85.1%
VP      88.6%   88.9%   0.99   66.5%   86.3%
VP gives 5.1% relative reduction in error (CO99 = my thesis parser).
Named Entity Tagging on Web Data:
              P       R       F
Max-Ent       84.4%   86.3%   85.3%
Perc.         86.1%   89.1%   87.6%
Improvement   10.9%   20.4%   15.6%
VP gives 15.6% relative reduction in error.
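The "relative reduction in error" figures can be reproduced from the table if the error is taken to be 100 minus the reported score; that definition and the function name are assumptions for this sketch, since the slide does not spell the calculation out.

```python
def relative_error_reduction(baseline_pct, new_pct):
    """Percentage by which the error (100 - score) shrinks, relative to the baseline error."""
    baseline_error = 100.0 - baseline_pct
    new_error = 100.0 - new_pct
    return 100.0 * (baseline_error - new_error) / baseline_error

# Named entity tagging: Max-Ent versus perceptron, for P, R and F.
ner = [round(relative_error_reduction(b, n), 1)
       for b, n in [(84.4, 86.1), (86.3, 89.1), (85.3, 87.6)]]

# Parsing: CO99 versus VP on the LP column.
parsing_lp = round(relative_error_reduction(88.3, 88.9), 1)
print(ner, parsing_lp)
```

Under this definition the three tagging improvements match the "Improvement" row, and the parsing figure of 5.1% corresponds to the LP column (the LR column gives a smaller value).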

Summary
For any representation $\Phi(x)$, efficient computation of $\Phi(x) \cdot \Phi(y)$ implies efficient learning through the kernel form of the perceptron.
Dynamic programming can be used to calculate $\Phi(x) \cdot \Phi(y)$ under all sub-fragments representations.
Several refinements of the inner products:
- Including probabilities from the baseline model
- Downweighting larger sub-fragments
- Sensitivity to spelling features
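As an illustration of what "kernel form of the perceptron" means, the sketch below trains a perceptron that never builds a feature vector and only calls a kernel function. It is a deliberately simplified binary classifier, not the ranking or voted perceptron used for the parsing and tagging experiments; the data, the polynomial kernel choice and all names are assumptions made for the example.

```python
def poly_kernel(u, v, p=2):
    """Polynomial kernel (1 + u.v) ** p, used here as a stand-in for any kernel."""
    return (1 + sum(a * b for a, b in zip(u, v))) ** p

def train_kernel_perceptron(xs, ys, kernel, epochs=20):
    """Dual perceptron: alpha[i] counts the mistakes made on training example i."""
    alpha = [0] * len(xs)

    def score(x):
        # The weight vector is implicit: sum_i alpha_i * y_i * Phi(x_i).
        return sum(a * y * kernel(xi, x) for a, y, xi in zip(alpha, ys, xs) if a)

    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(xs, ys)):
            if y * score(x) <= 0:   # wrong (or undecided): add this example
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:           # a full pass with no mistakes: stop
            break
    return alpha, score

# XOR-style data, not linearly separable in the original two dimensions.
xs = [(0, 0), (0, 1), (1, 0), (1, 1)]
ys = [-1, 1, 1, -1]
alpha, score = train_kernel_perceptron(xs, ys, poly_kernel)
predictions = [1 if score(x) > 0 else -1 for x in xs]
print(alpha, predictions)
```

Swapping `poly_kernel` for a tree or tagged-sequence kernel leaves the training loop unchanged, which is the point of the kernel form: learning cost depends on kernel evaluations, not on the dimension of the feature space.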