Stroke-Based Performance Metrics for Handwritten Mathematical Expressions

2011 International Conference on Document Analysis and Recognition

Richard Zanibbi (rlaz@cs.rit.edu), Amit Pillay (p2731@rit.edu)
Department of Computer Science, Rochester Institute of Technology, NY, USA

Harold Mouchère (harold.mouchere@univ-nantes.fr), Christian Viard-Gaudin (christian.viard-gaudin@univ-nantes.fr)
LUNAM, Université de Nantes, IRCCyN, France

Dorothea Blostein (blostein@cs.queensu.ca)
School of Computing, Queen's University, Canada

Abstract

Evaluating mathematical expression recognition involves a complex interaction of input primitives (e.g. pen/finger strokes), recognized symbols, and recognized spatial structure. Existing performance metrics simplify this problem by separating the assessment of spatial structure from the assessment of symbol segmentation and classification. These metrics do not characterize the overall accuracy of pen-based mathematics recognition, making it difficult to compare math recognition algorithms, and preventing the use of machine learning algorithms requiring a criterion function characterizing overall system performance. To address this problem, we introduce performance metrics that bridge the gap from handwritten strokes to spatial structure. Our metrics are computed using bipartite graphs that represent classification, segmentation and spatial structure at the stroke level. Overall correctness of an expression is measured by counting the number of relabelings of nodes and edges needed to make the bipartite graph for a recognition result match the bipartite graph for ground truth. This metric may also be used with other primitive types (e.g. image pixels).

Keywords: Performance Evaluation; Math Recognition; Handwriting Recognition; Graphics Recognition

I. INTRODUCTION

Evaluating the performance of document analysis systems is an important and difficult problem. As recently summarized by Silva [1], much of the difficulty stems from diversity in goals, input types, input domains, concepts, output granularity, evaluation moments, and evaluation metrics. Our work focuses on addressing performance evaluation issues that arise due to diversity in granularity.
Performance metrics are simpler to define for computations in which input and output items have similar granularity. In the case of math recognition, the granularity differs markedly, with a spatial arrangement of strokes or symbols as input, and a hierarchical layout description (e.g. LaTeX, see Figure 1) and/or a representation of meaning (e.g. Content MathML, OpenMath) as output. There is a need for standard metrics that permit meaningful comparison of math recognition results [2], [3], both for comparing systems, and for use with machine learning algorithms that optimize system performance.

Figure 1. Handwritten Expression Containing Five Strokes and Four Symbols: a) Input, b) LaTeX, c) Symbol Layout Tree (U: up, D: down, S: superscript). A LaTeX string (b) or equivalent Symbol Layout Tree (c) may be used to represent symbol arrangement.

1520-5363/11 $26.00 © 2011 IEEE. DOI 10.1109/ICDAR.2011.75

Our primary contribution is a bipartite graph-based representation of expression structure at the level of primitives (e.g. strokes) that captures errors in both recognized symbols and layout. We were motivated to use a representation at the primitive rather than symbol level because the distinction between symbol segmentation and structure recognition is sometimes blurred. For example, the configuration = can be viewed either as a single symbol consisting of two spatially separated strokes, or as two symbols (short horizontal lines) with one atop another. We consider isolated expressions in this paper, but our method may be adapted in a straightforward manner for multiple expressions, flowcharts, tables, and even images using pixel regions such as connected components or image patches.

The second contribution of the paper is a set of new error metrics based on our bipartite representation. This includes metrics that characterize overall recognition accuracy, providing a criterion function for expressions based on strokes as primitives; existing criterion functions use symbols as primitives, as discussed in Section II.

We assume that it is possible to define a ground truth interpretation for a given set of strokes. This excludes undersegmented strokes, in which a single stroke is used to produce two items that must be represented separately in the ground truth interpretation (e.g. two cursive symbols written using a single stroke).

For detailed performance analysis, different metrics are needed to characterize accuracy for specific tasks: segmentation, classification, and parsing. However, in some situations a single value characterizing performance is needed, such as when using machine learning algorithms to optimize system performance as a whole. Thus, in this paper, we discuss several component metrics (Section IV) and also discuss methods of combining these into a single overall estimate of performance (Section V).

II. PREVIOUS WORK IN EVALUATING MATH RECOGNITION

Mathematical expression recognition is an active research field, both for on-line and off-line data [4], [5], [6]. Ground-truthed datasets are now available (e.g. [7] (off-line) and [8] (on-line)). As pointed out by both Lapointe and Blostein [2] and Awal et al. [3], the math recognition domain now needs standard evaluation metrics to support comparisons of existing and newly developed systems.

Most existing approaches to evaluating math recognition compute a distance between the recognized expression and the ground truth, according to different aspects. The expression recognition rate is common, but global and relatively uninformative, as it counts only expressions that precisely match ground truth. Symbol recognition rate does not consider symbol layout; baseline recognition checks only if symbols appear on the correct baseline relative to a symbol. Some metrics, such as the average performance index [9], weight errors depending on the depth of nesting for baselines in an expression.

A difficulty in defining an accuracy metric for symbol layout in math is that the tree-based representation needed to represent symbol layout is unsuitable for use with classical metrics used for text recognition, such as the Levenshtein edit distance. One solution is to use a tree-edit distance, but this is not used in practice, because of the NP complexity of the existing algorithms to match both tree edges and nodes. An interesting solution by Garain et al. [10] proposes to transform the tree into a token string, which allows one to use edit distance.
The drawback of this approach is that it loses some of the edit operations offered by tree edits (like swapping children of a node), leading them to incur a high cost in the string-based representation.

Our main contribution is a bipartite graph representation of expression structure at the level of strokes, from which metrics based on Hamming distances may be simply defined, with an intuitive interpretation. Given a set of input primitives for a test expression, our representation removes the need to match individual strokes, so that only stroke labels and layout relationships between strokes need to be matched, as described in the next section.

III. EXPRESSION REPRESENTATION

In our approach, the recognizer output and ground truth interpretations must first be converted into bipartite graphs. This is illustrated in Figures 2 and 3. The bipartite representation is shown in Figure 3a), where the nodes of the graph represent each stroke in the expression twice: as an unlabeled input stroke (at left), and with an assigned symbol label and detected relationships (at right, with spatial relationships shown as incoming edges). Note that there are $N(N-1)$ edges in this graph, where $N$ is the number of strokes, and we omit edges from strokes to themselves. For legibility, edges representing no relationship are not drawn.

This bipartite graph is constructed from a symbol layout tree: strokes in symbol nodes are split into separate stroke nodes (see Figure 2), with each stroke possessing the spatial relationships of its associated symbol. All stroke nodes inherit the spatial relationships of their ancestors in the layout tree. In Figure 2, the two strokes of the two-stroke symbol inherit the down spatial relationship of their single-stroke ancestor symbol. Note that this inheritance applies to all spatial relationships, including containment by square roots, and horizontal adjacency: for example, in k + m, m inherits the Right relationship of the + relative to the k. The information presented in the DAG in Figure 2b) can then be converted directly into a bipartite graph, as shown in Figure 3a). Note that strokes with the same set of incoming spatial relationships in Figure 3 are symbols, i.e. stroke relationships induce a segmentation of strokes into symbols. It is easier to visualize differences in interpretations using the bipartite representation (as node positions in the graph may be fixed across interpretations), but it is easier to visualize interpreted layout from the DAG representation.

To evaluate symbol segmentation separately from layout, we may use a second bipartite graph: see Figure 3b). An undirected edge is placed between all pairs of non-identical strokes belonging to a symbol. Thus a symbol composed of 3 strokes is represented by 6 edges; one isolated stroke corresponding to a symbol is not connected.

Figure 2. Stroke-level Ground Truth Representations for the Expression in Figure 1: a) Stroke Layout Tree, b) Stroke Layout DAG. Strokes are labeled using s<num>. In b), the stroke layout tree is converted to a DAG by adding incoming spatial relationships at a node to all of its descendants. Spatial relationships shown are up (U), down (D), superscript, subscript (Sub), and right (R).
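
The construction described in Section III can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `stroke_graph` and `segmentation_edges`, the dictionary layout, and the stroke ids `s1` to `s4` are all hypothetical. It uses the k + m example from the text, assuming for illustration that k and m are single strokes and + is written with two strokes.

```python
from itertools import permutations


def stroke_graph(symbols, tree_edges):
    """Build stroke-level node and edge labels from a symbol layout tree.

    symbols:    {symbol_id: (label, [stroke ids])}          (hypothetical layout)
    tree_edges: [(parent_symbol, child_symbol, relation)]   (hypothetical layout)
    Edges that carry no relationship are simply left out of the result.
    """
    children = {}
    for parent, child, _ in tree_edges:
        children.setdefault(parent, []).append(child)

    def subtree(sym):
        # The symbol itself plus all of its descendants in the layout tree.
        out = [sym]
        for child in children.get(sym, []):
            out.extend(subtree(child))
        return out

    # Every stroke takes the label of the symbol it belongs to.
    node_labels = {}
    for label, strokes in symbols.values():
        for stroke in strokes:
            node_labels[stroke] = label

    # A relationship parent -> child is inherited by every descendant of child.
    edge_labels = {}
    for parent, child, relation in tree_edges:
        for target in subtree(child):
            for p in symbols[parent][1]:
                for q in symbols[target][1]:
                    edge_labels[(p, q)] = relation
    return node_labels, edge_labels


def segmentation_edges(symbols):
    """Ordered pairs of distinct strokes that belong to the same symbol."""
    return {pair for _, strokes in symbols.values()
            for pair in permutations(strokes, 2)}


# k + m, with "+" assumed to be written using two strokes.
symbols = {"k": ("k", ["s1"]), "plus": ("+", ["s2", "s3"]), "m": ("m", ["s4"])}
tree_edges = [("k", "plus", "R"), ("plus", "m", "R")]

node_labels, edge_labels = stroke_graph(symbols, tree_edges)
print(edge_labels[("s1", "s4")])            # R: m inherits Right relative to k
print(sorted(segmentation_edges(symbols)))  # [('s2', 's3'), ('s3', 's2')]
```

Storing only the edges that carry a relationship keeps the graph sparse; a missing key stands for "no relationship", which matches the convention of not drawing such edges.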

Figure 3. Bipartite Graphs Representing the Expression in Figure 1: a) Stroke labels and layout, b) Segmentation. Nodes represent strokes, and labeled edges represent spatial relationships. In a), symbol classes are shown using node labels, and spatial relationships using edge labels. This graph represents the same information as in Figure 2b). In b) a segmentation graph is shown, in which strokes belonging to a symbol are connected.

IV. METRICS FOR SPECIFIC ERROR TYPES

Given two bipartite graphs representing recognizer output and ground truth for strokes in a handwritten math expression, recognition errors may be found directly as disagreeing node or edge labels. A number of examples are provided in Figure 4; incorrect labels and relationships relative to ground truth are shown in red.

a) a mislabeled stroke (1 error, $\Delta_C$)
b) misrecognized relationships (2 errors, $\Delta_L$)
c) a segmentation error, where the two-stroke symbol has been split into two symbols. There are two mislabeled strokes (classification errors), and a spurious spatial relationship (3 errors)
d) an error similar to that in c), but with the stem of the split symbol misrecognized as being above the fraction line; there is also a missing relationship (5 errors)

Figure 4. Example Recognition Errors for the Expression in Figure 1: a) Classification Error, b) Layout Error, c) Classification and Segmentation, d) Classification, Segmentation and Layout. In the DAGs errors are shown in red, and by filled nodes and edges in the bipartite graphs. In the bipartite graph thumbnails, nodes correspond to strokes as shown in Figure 3. In part d) there are five errors: the two-stroke symbol has been separated into two mis-classified strokes, with two spurious spatial relationships (including Sub), and one missing relationship (a superscript relationship involving the vertical line of the split symbol).

These errors are summarized in Table I. Additionally, one may count the number of disagreeing stroke pairings in segmentation graphs as illustrated in Figure 3b) ($\Delta_S$). In Figure 4c) and d), two segmentation graph edges from ground truth are missing.

For a set of strokes $S$ and two expressions defined on $S$ represented by bipartite graphs $E_1$ and $E_2$, we define the metrics below for specific stroke properties. Let $D$ be the set of all non-identical stroke pairs: $D = \{(p, q) \in S \times S : p \neq q\}$, where $|D| = |S|(|S| - 1)$.

Each metric below is a Hamming distance, specifically the number of disagreeing labels/relationships. As such, each satisfies the four requirements for a metric [11]: non-negativity, symmetry ($\Delta(E_1, E_2) = \Delta(E_2, E_1)$), $\Delta(E_1, E_1) = 0$, and the triangle inequality: $\Delta(E_1, E_3) \leq \Delta(E_1, E_2) + \Delta(E_2, E_3)$.

Classification ($\Delta_C$): The number of strokes with different symbol labels in the expression graphs $E_1$ and $E_2$:

$$\Delta_C(E_1, E_2) = |\{ s \in S : l(s, E_1) \neq l(s, E_2) \}| \qquad (1)$$

Layout ($\Delta_L$): Let $L_1$ and $L_2$ be the sets of labelled edges in expression graphs $E_1$ and $E_2$. Layout disagreement is the number of disagreeing edge labels between non-identical strokes:

$$\Delta_L(E_1, E_2) = |L_1 - L_2| \qquad (2)$$

Segmentation ($\Delta_S$): This is defined similarly to layout, but using the undirected segmentation bipartite graphs (see Figure 3b) $B_1$ and $B_2$ constructed on the set of strokes for each symbol relation tree:

$$\Delta_S(E_1, E_2) = |B_1 - B_2| \qquad (3)$$

V. EXPRESSION-LEVEL DISTANCE METRICS

We now combine our metrics for specific error types (classification, segmentation, and layout) into Expression-Level Distance Metrics that define a single distance measure for two interpretations of an expression. First consider the distance metric $\Delta_B \in [0, 1]$, defined as the number of disagreeing stroke labels and spatial relationships, such as shown in the thumbnail images of Figure 4, normalized by the number of labels in a graph. This is a Hamming distance, with $|S|^2$ elements in each vector of node/edge labels for a graph:

$$\Delta_B(E_1, E_2) = \frac{\Delta_C + \Delta_L}{|S|^2} \qquad (4)$$

This metric is unweighted, and as a result will produce less distance for classification errors than for errors in layout and segmentation (represented implicitly in the layout relationships). As an absolute measure of the difference between two bipartite graphs $\Delta_B$ is sufficient, but one may want to weight errors to make classification errors proportional to segmentation and layout errors. In particular, when comparing algorithms for use in practice, or when using machine learning to optimize the complete recognition system, one may want to weight the different error types. We define a metric $\Delta_E \in [0, 1]$ as the average of the per-stroke classification, segmentation and layout errors:

$$\Delta_E(E_1, E_2) = \frac{1}{3} \left( \frac{\Delta_C(E_1, E_2)}{|S|} + \sqrt{\frac{\Delta_S(E_1, E_2)}{|D|}} + \sqrt{\frac{\Delta_L(E_1, E_2)}{|D|}} \right) \qquad (5)$$

We use the square root of the segmentation and spatial relationship distances in order to make them proportional to $|S|$ rather than $|S|^2$ (one could instead divide $\Delta_L$ and $\Delta_S$ by $|S| - 1$ for the same reason). This prevents differences in segments and spatial relationships from being weighted less heavily than differences in stroke (symbol) classification labels. As each component distance is in $[0, 1]$, $\Delta_E$ also lies in the interval $[0, 1]$.

$\Delta_B$ and $\Delta_E$ are proper metrics. They are non-negative, symmetric, and the distance from a layout tree to itself is 0. As the square root is an order-preserving (monotonic) and subadditive function on non-negative values, the square root of a metric is also a metric. Given that $\Delta_C$, $\Delta_S$ and $\Delta_L$ are proper metrics, their sum obeys the triangle inequality. Similarly, using the average of their sum does not invalidate the metric property.

Both $\Delta_B$ and $\Delta_E$ require $O(|S|^2)$ time to compute. In practice, $|S|$ tends to be relatively small, and so the quadratic complexity is not a significant concern. Further, absent spatial relationships need never be explicitly compared: we can simply count labels and relationships present in at least one of the two input graphs.

In order to illustrate and compare these metrics, Table I shows these five distances between each erroneous interpretation in Figure 4 and its ground truth. Notice that classification errors are weighted more heavily in $\Delta_E$, and that in general the computed distance/error value is higher for $\Delta_E$ than $\Delta_B$.

Table I. Distance between expressions in Figure 4 and ground truth in Figures 2b) and 3a)

Fig. 4    $\Delta_C$    $\Delta_S$    $\Delta_L$    $\Delta_B$    $\Delta_E$
a)        1             0             0             0.04          0.067
b)        0             0             2             0.08          0.105
c)        2             2             1             0.12          0.313
d)        2             2             3             0.2           0.368
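
As a check on Equations (4) and (5), the following Python sketch recomputes the $\Delta_B$ and $\Delta_E$ columns of Table I from the error counts $\Delta_C$, $\Delta_S$, $\Delta_L$ for the five-stroke expression of Figure 1. The function names `disagreements`, `delta_b` and `delta_e` are hypothetical, and the sparse dictionary representation (a missing key means "no label or relationship") is an assumption made for illustration, in the spirit of counting only labels and relationships present in at least one of the two graphs.

```python
from math import sqrt


def disagreements(labels1, labels2):
    """Hamming-style count of keys whose labels differ between two graphs.

    Each argument maps a stroke (or an ordered stroke pair) to a label.
    A key missing from one dictionary is treated as 'no label/relationship'.
    """
    keys = set(labels1) | set(labels2)
    return sum(labels1.get(k) != labels2.get(k) for k in keys)


def delta_b(d_c, d_l, n_strokes):
    """Equation (4): unweighted disagreements over |S|^2 labels."""
    return (d_c + d_l) / n_strokes ** 2


def delta_e(d_c, d_s, d_l, n_strokes):
    """Equation (5): average of per-stroke error rates (needs >= 2 strokes)."""
    n_pairs = n_strokes * (n_strokes - 1)  # |D| = |S|(|S| - 1)
    return (d_c / n_strokes
            + sqrt(d_s / n_pairs)
            + sqrt(d_l / n_pairs)) / 3


# Error counts (delta_C, delta_S, delta_L) for rows a) to d) of Table I.
table_rows = {"a": (1, 0, 0), "b": (0, 0, 2), "c": (2, 2, 1), "d": (2, 2, 3)}
for name, (d_c, d_s, d_l) in table_rows.items():
    print(name,
          round(delta_b(d_c, d_l, 5), 2),
          round(delta_e(d_c, d_s, d_l, 5), 3))
# a 0.04 0.067
# b 0.08 0.105
# c 0.12 0.313
# d 0.2 0.368
```

In this sketch `disagreements` would supply $\Delta_C$ when given node-label dictionaries and $\Delta_L$ when given edge-label dictionaries; for segmentation graphs stored as sets of stroke pairs, the size of the symmetric difference plays the same role.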
VI. GENERALIZATION: SEGMENTS AND PIXELS

We have assumed that no stroke corresponds to more than one symbol in the input (i.e. no stroke is under-segmented). This assumption may be removed if we use finer-grained primitives, such as line segments rather than whole strokes. A single stroke containing two symbols can then be partitioned, and the resulting segmentation evaluated. Document images often have some symbols overlapping within a single connected component, such as in $\frac{y}{x}$, where the fraction line and y may intersect. In this case we can deconstruct connected components into smaller subcomponents that correspond to small contiguous regions, or, as a more extreme approach, take pixels to be the primitives.

Using the smallest possible primitives (e.g. pixels) is attractive because under-segmentation cannot occur; however, efficiency may become a problem, as the bipartite graphs/DAGs would be very large. Pixel-level ground truth is imprecise; however, this level of ground truthing is common in computer vision, where it is understood that the human interpretation involved in constructing ground truth results in residual errors for decisions within ambiguous regions (e.g. identifying the specific split point between two connected symbols drawn with a single stroke).

With appropriate primitives, the metrics presented may be used as criterion functions for machine learning algorithms. In most cases, losses for errors in stroke labeling and relationships will need to be softened to values in $[0, 1]$ rather than $\{0, 1\}$, e.g. to avoid discontinuities in the error surface when using algorithms based on gradient descent. These soft errors may be obtained using additional metrics for stroke labels and relationships (e.g. probabilities or fuzzy values).

VII. CONCLUSION

We have presented new metrics for comparing the similarity of two interpretations of a set of online strokes, with application to pen-based mathematics recognition. Our approach is novel in that it uses strokes rather than symbols as the basis for comparing symbol and structure recognition results. This has the advantage of providing a broader characterization of system performance, allowing expression-level performance to be assessed in terms of input primitives. Our metrics can be efficiently computed, in time $O(n^2)$, where $n$ is the number of strokes. Note that handwritten expressions typically consist of a relatively small number of strokes. The approach can also be easily adapted to other pen-based domains, such as recognition of flowcharts, and for use in images.

An open question is whether the metric can be usefully applied when one cannot assume that the sets of primitives for two recognition results being compared match (e.g. for Mathematical Information Retrieval (MIR) applications). A related issue is defining metrics for evaluation of mathematical content (i.e. mathematical syntax of a recognized expression); the method presented in this paper addresses only evaluation of layout. As mathematical content is normally represented hierarchically by operator trees, it may be possible to employ a bipartite graph-based approach to evaluation there as well, again using input primitives as the nodes in the graph.

Acknowledgements: This material is based upon work supported by the National Science Foundation under Grant No. IIS-1016815, the Natural Sciences and Engineering Research Council of Canada, the Xerox Foundation, and the Center for Emerging and Innovative Sciences (NYSTAR).

REFERENCES

[1] A. Silva, "Metrics for evaluating performance in document analysis: application to tables," Int'l J. Document Analysis and Recognition, vol. 14, pp. 101-109, 2011.
[2] A. Lapointe and D. Blostein, "Issues in performance evaluation: A case study of math recognition." IEEE Computer Society, 2009, pp. 1355-1359.
[3] A.-M. Awal, H. Mouchère, and C. Viard-Gaudin, "The problem of handwritten mathematical expression recognition evaluation," in Int'l Conf. on Frontiers in Handwriting Recognition, Kolkata, India, 2010, pp. 646-651.
[4] D. Blostein and A. Grbavec, "Recognition of mathematical notation," in Handbook of Character Recognition and Document Image Analysis. World Scientific Publishing Company, 1997, pp. 557-582.
[5] K.-F. Chan and D.-Y. Yeung, "Mathematical expression recognition: a survey," International Journal on Document Analysis and Recognition, vol. 3, pp. 3-15, Aug 2000.
[6] U. Garain and B. Chaudhuri, OCR of Printed Mathematical Expressions. Springer, 2007, pp. 235-259.
[7] S. Uchida, A. Nomura, and M. Suzuki, "Quantitative analysis of mathematical documents," Int'l J. Document Analysis and Recognition, vol. 7, no. 4, pp. 211-218, 2005.
[8] S. MacLean, G. Labahn, E. Lank, M. Marzouk, and D. Tausky, "Grammar-based techniques for creating ground-truthed sketch corpora," Int'l J. Document Analysis and Recognition, vol. 14, no. 1, pp. 65-74, 2011.
[9] U. Garain and B. Chaudhuri, "A corpus for OCR research on mathematical expressions," Int'l J. Document Analysis and Recognition, vol. 7, no. 4, pp. 241-259, 2005.
[10] K. Sain, A. Dasgupta, and U. Garain, "Emers: a tree matching-based performance evaluation of mathematical expression recognition systems," Int'l J. Document Analysis and Recognition, no. 14, pp. 75-85, 2011.
[11] R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed. Wiley, 2001.