Automated Summarisation for Evidence Based Medicine

Similar documents
Text Summarisation for Evidence Based Medicine

Application of a GA/Bayesian Filter-Wrapper Feature Selection Method to Classification of Clinical Depression from Speech Data

CROSS SECTIONAL STUDY & SAMPLING METHOD

Evaluation Strategies

CLRG Biocreative V

ToxiCat: Hybrid Named Entity Recognition services to support curation of the Comparative Toxicogenomic Database

Applying Mathematical Processes. Symmetry. Resources. The Clothworkers Foundation

Sampling Strategies to Evaluate the Performance of Unknown Predictors

Formal Ontology Construction from Texts

Robert Edgar. Independent scientist

Estimation of Optimal Treatment Regimes Via Machine Learning. Marie Davidian

Penn Treebank Parsing. Advanced Topics in Language Processing Stephen Clark

Learning Classification with Auxiliary Probabilistic Information Quang Nguyen Hamed Valizadegan Milos Hauskrecht

Harmonisation of product notification. Ronald de Groot Dutch Poisons Information Center

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

CS246 Final Exam, Winter 2011

CS 543 Page 1 John E. Boon, Jr.

Using C-OWL for the Alignment and Merging of Medical Ontologies

Part A. P (w 1 )P (w 2 w 1 )P (w 3 w 1 w 2 ) P (w M w 1 w 2 w M 1 ) P (w 1 )P (w 2 w 1 )P (w 3 w 2 ) P (w M w M 1 )

EVALUATING RISK FACTORS OF BEING OBESE, BY USING ID3 ALGORITHM IN WEKA SOFTWARE

A Support Vector Method for Multivariate Performance Measures

Medical Question Answering for Clinical Decision Support

Programme Specification MSc in Cancer Chemistry

European harmonisation of product notification- Status 2014

Test and Evaluation of an Electronic Database Selection Expert System

Predictive Analytics on Accident Data Using Rule Based and Discriminative Classifiers

Instructions for NLP Practical (Units of Assessment) SVM-based Sentiment Detection of Reviews (Part 2)

Agilent MassHunter Quantitative Data Analysis

1 Information retrieval fundamentals

Boolean and Vector Space Retrieval Models CS 290N Some of slides from R. Mooney (UTexas), J. Ghosh (UT ECE), D. Lee (USTHK).

Machine Learning (CS 567) Lecture 2

Decision Trees. Data Science: Jordan Boyd-Graber University of Maryland MARCH 11, Data Science: Jordan Boyd-Graber UMD Decision Trees 1 / 1

Investigating Factors that Influence Climate

A SAS/AF Application For Sample Size And Power Determination

Course Descriptions Biology

More Smoothing, Tuning, and Evaluation

Induction of Decision Trees

Course plan Academic Year Qualification MSc on Bioinformatics for Health Sciences. Subject name: Computational Systems Biology Code: 30180

COMS F18 Homework 3 (due October 29, 2018)

Answer keys for Assignment 10: Measurement of study variables (The correct answer is underlined in bold text)

The Lady Tasting Tea. How to deal with multiple testing. Need to explore many models. More Predictive Modeling

Analysis of 2x2 Cross-Over Designs using T-Tests

Today. HW 1: due February 4, pm. Aspects of Design CD Chapter 2. Continue with Chapter 2 of ELM. In the News:

Principal Moderator s Report

Apprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning

Association Rule Mining on Web

Models, Data, Learning Problems

Topic Models and Applications to Short Documents

The Purpose of Hypothesis Testing

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington

Lassen Community College Course Outline

Lecture 10: Comparing two populations: proportions

A Study of Biomedical Concept Identification: MetaMap vs. People

Evaluation. Brian Thompson slides by Philipp Koehn. 25 September 2018

Nuclear Data Activities in the IAEA-NDS

Compiled by the Queensland Studies Authority

Special Topics in Computer Science

6 Single Sample Methods for a Location Parameter

SABIO-RK Integration and Curation of Reaction Kinetics Data Ulrike Wittig

Part I: Web Structure Mining Chapter 1: Information Retrieval and Web Search

Geographic Analysis of Linguistically Encoded Movement Patterns A Contextualized Perspective

Modern Information Retrieval

Text Mining. Dr. Yanjun Li. Associate Professor. Department of Computer and Information Sciences Fordham University

Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size

Chemistry with Spanish for Science

SciFinder Scholar Guide to Getting Started

Mining Newsgroups Using Networks Arising From Social Behavior by Rakesh Agrawal et al. Presented by Will Lee

Applied Natural Language Processing

A few words on white space control

Analysing longitudinal data when the visit times are informative

MassHunter TOF/QTOF Users Meeting

An Introduction to GLIF

4:3 LEC - PLANNED COMPARISONS AND REGRESSION ANALYSES

STRANDS Major areas of study that include Life Science, Earth and Space Science, and Science as Inquiry

Bayesian Learning. Examples. Conditional Probability. Two Roles for Bayesian Methods. Prior Probability and Random Variables. The Chain Rule P (B)

Presentation of the Cooperation Project goals. Nicola Ferrè

Classification of Voice Signals through Mining Unique Episodes in Temporal Information Systems: A Rough Set Approach

STRANDS BENCHMARKS GRADE-LEVEL EXPECTATIONS. Biology EOC Assessment Structure

ML (cont.): SUPPORT VECTOR MACHINES

Processing/Speech, NLP and the Web

Physics 2020 Laboratory Manual

Q/A-LIST FOR THE SUBMISSION OF VARIATIONS ACCORDING TO COMMISSION REGULATION (EC) 1234/2008

ECLT 5810 Data Preprocessing. Prof. Wai Lam

What s Cooking? Predicting Cuisines from Recipe Ingredients

Learning situation-specific knowledge for solving situation-specific problems. A. Aamodt, NTNU,

MSc Drug Design. Module Structure: (15 credits each) Lectures and Tutorials Assessment: 50% coursework, 50% unseen examination.

Joint Research Centre

Lecture 9: Learning Optimal Dynamic Treatment Regimes. Donglin Zeng, Department of Biostatistics, University of North Carolina

Using Decision Trees to Evaluate the Impact of Title 1 Programs in the Windham School District

Machine Learning for NLP

Programme Specification (Undergraduate) MSci Physics

CSE 258, Winter 2017: Midterm

Practical APLIS-based Structured/Synoptic Reporting

Web Search and Text Mining. Learning from Preference Data

Q learning. A data analysis method for constructing adaptive interventions

Performance Evaluation

CSC 411: Lecture 03: Linear Classification

A Gate-keeping Approach for Selecting Adaptive Interventions under General SMART Designs

Click Prediction and Preference Ranking of RSS Feeds

Decision Trees. CS57300 Data Mining Fall Instructor: Bruno Ribeiro

Transcription:

Automated Summarisation for Evidence Based Medicine Diego Mollá Centre for Language Technology, Macquarie University HAIL, 22 March 2012

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 2/60

About Us: Research Group on Natural Language Processing of Medical Texts http://web.science.mq.edu.au/~diego/medicalnlp/ Active Members Diego Mollá Senior lecturer at Macquarie University. Cécile Paris Senior principal research scientist at CSIRO ICT Centre. Abeed Sarker PhD student at Macquarie University. Sara Faisal Shash Masters student. Past Members María Elena Santiago-Martínez Research programmer. Patrick Davis-Desmond Masters student. Andreea Tutos Masters student. EBM Summarisation Diego Mollá 3/60

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 4/60

Evidence Based Medicine http://laikaspoetnik.wordpress.com/2009/04/04/evidence-based-medicine-the-facebook-of-medicine/ EBM Summarisation Diego Mollá 5/60

EBM and Natural Language Processing http://hlwiki.slais.ubc.ca/index.php?title=five_steps_of_ebm EBM Summarisation Diego Mollá 6/60

PICO for Asking the Right Question EBM Summarisation Diego Mollá 7/60

Where to search for external evidence? 1. Evidence-based Summaries (Systematic Reviews): EBM Online (http://ebm.bmj.com). UptoDate (http://www.uptodate.com). The Cochrane Library (http://www.thecochranelibrary.com/).... EBM Summarisation Diego Mollá 8/60

Where to search for external evidence? 1. Evidence-based Summaries (Systematic Reviews): EBM Online (http://ebm.bmj.com). UptoDate (http://www.uptodate.com). The Cochrane Library (http://www.thecochranelibrary.com/).... 2. Search the Medical Literature: E.g. PubMed (http://www.ncbi.nlm.nih.gov/pubmed/). EBM Summarisation Diego Mollá 8/60

Searching Cochrane EBM Summarisation Diego Mollá 9/60

Searching PubMed EBM Summarisation Diego Mollá 10/60

Searching the Trip Database EBM Summarisation Diego Mollá 11/60

Appraising the Evidence The SORT Taxonomy Level A Consistent and good-quality patient-oriented evidence. Level B Inconsistent or limited-quality patient-oriented evidence. Level C Consensus, usual practise, opinion, disease-oriented evidence, or case series for studies of diagnosis, treatment, prevention, or screening. EBM Summarisation Diego Mollá 12/60

Levels of Evidence Study quality Diagnosis Treatment / prevention / screening Prognosis Level 1: good-quality patient-oriented evidence Validated clinical decision rule; SR/meta-analysis of high-quality studies; highquality diagnostic cohort study SR/meta-analysis of RCTs with consistent findings; high-quality individual RCT; all-or-none study SR/meta-analysis of goodquality cohort studies; prospective cohort study with good follow-up Level 2: limited-quality patient-oriented evidence Unvalidated clinical decision rule; SR/metaanalysis of lower-quality studies or studies with inconsistent findings; lower-quality diagnostic cohort study or diagnostic case-control study SR/meta-analysis of lowerquality clinical trials or of studies with inconsistent findings; lower-quality clinical trial; cohort study; case-control study SR/meta-analysis of lowerquality cohort studies or with inconsistent results; retrospective cohort study or prospective cohort study with poor follow-up; casecontrol study; case series Level 3: evidence other Consensus guidelines, extrapolations from bench research, usual practice, opinion, disease-oriented evidence (intermediate or physiologic outcomes only), or case series for studies of diagnosis, treatment, prevention, or screening EBM Summarisation Diego Mollá 13/60

Where can NLP Help? Questions: Help to formulate answerable questions. Question analysis and classification. EBM Summarisation Diego Mollá 14/60

Where can NLP Help? Questions: Help to formulate answerable questions. Question analysis and classification. Search: Retrieve and rank relevant literature. Extract the evidence-based information. Summarise the results. EBM Summarisation Diego Mollá 14/60

Where can NLP Help? Questions: Help to formulate answerable questions. Question analysis and classification. Search: Retrieve and rank relevant literature. Extract the evidence-based information. Summarise the results. Appraisal: Classify the evidence. EBM Summarisation Diego Mollá 14/60

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 15/60

Where s the Corpus for Summarisation? Summarisation Systems CENTRIFUSER/PERSIVAL: Developed and tested using user feedback (iterative design). SemRep: Evaluation based on human judgement. Demner-Fushman & Lin: ROUGE on original paper abstracts. Fiszman: Factoid-based evaluation. EBM Summarisation Diego Mollá 16/60

Where s the Corpus for Summarisation? Summarisation Systems CENTRIFUSER/PERSIVAL: Developed and tested using user feedback (iterative design). SemRep: Evaluation based on human judgement. Demner-Fushman & Lin: ROUGE on original paper abstracts. Fiszman: Factoid-based evaluation. Corpora Several corpora of questions/answers available. Answers lack explicit pointers to primary literature. Medical doctors want to know the primary sources. EBM Summarisation Diego Mollá 16/60

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 17/60

Journal of Family Practice s Clinical Inquiries EBM Summarisation Diego Mollá 18/60

The XML Contents I <u r l >h t t p : / /www. j f p o n l i n e. com/ Pages. asp?aid=7843&amp ; i s s u e=september 2009&amp ; UID=</u r l > <record id = 7843 > <question>which treatments work best f o r hemorrhoids?</question> <answer> <s n i p i d = 1 > <s n i p t e x t >E x c i s i o n i s t h e most e f f e c t i v e t r e a t m e n t f o r thrombosed e x t e r n a l h e m o r r h o i d s.</ s n i p t e x t > <s o r t y p e= B > r e t r o s p e c t i v e s t u d i e s </sor> <l o n g i d = 1 1 > <l o n g t e x t >A r e t r o s p e c t i v e s t u d y o f 231 p a t i e n t s t r e a t e d c o n s e r v a t i v e l y o r s u r g i c a l l y found t h a t t h e 48.5% o f p a t i e n t s t r e a t e d s u r g i c a l l y had a l o w e r r e c u r r e n c e r a t e than t h e c o n s e r v a t i v e group ( number needed to t r e a t [NNT]=2 f o r recurrence at mean follow up of 7. 6 months ) and e a r l i e r r e s o l u t i o n o f symptoms ( a v e r a g e 3. 9 days compared w i t h 24 days f o r c o n s e r v a t i v e t r e a t m e n t ).</ l o n g t e x t > <r e f i d = 15486746 a b s t r a c t = A b s t r a c t s /15486746. xml >Greenspon J, W i l l i a m s SB, Young HA, e t a l. Thrombosed e x t e r n a l h e m o r r h o i d s : outcome a f t e r c o n s e r v a t i v e o r s u r g i c a l management. Dis Colon Rectum. 2004; 47: 1493 1498.</ ref> </long> <l o n g i d = 1 2 > <l o n g t e x t >A r e t r o s p e c t i v e a n a l y s i s o f 340 p a t i e n t s who underwent o u t p a t i e n t e x c i s i o n o f thrombosed e x t e r n a l h e m o r r h o i d s under l o c a l a n e s t h e s i a r e p o r t e d a low r e c u r r e n c e r a t e o f 6.5% a t a EBM Summarisation Diego Mollá 19/60

The XML Contents II mean f o l l o w up o f 1 7.3 months.</ l o n g t e x t > <r e f i d = 12972967 a b s t r a c t = A b s t r a c t s /12972967. xml >Jongen J, Bach S, S t u b i n g e r SH, e t a l. E x c i s i o n o f thrombosed e x t e r n a l h e m o r r h o i d s under l o c a l a n e s t h e s i a : a r e t r o s p e c t i v e e v a l u a t i o n of 340 p a t i e n t s. Dis Colon Rectum. 2003; 46: 1226 1231.</ ref> </long> <l o n g i d = 1 3 > <l o n g t e x t >A p r o s p e c t i v e, randomized c o n t r o l l e d t r i a l (RCT) o f 98 p a t i e n t s t r e a t e d n o n s u r g i c a l l y found improved p a i n r e l i e f w i t h a c o m b i n a t i o n o f t o p i c a l n i f e d i p i n e 0.3% and l i d o c a i n e 1.5% compared w i t h l i d o c a i n e a l o n e. The NNT f o r complete p a i n r e l i e f a t 7 days was 3.</ l o n g t e x t > <r e f i d = 11289288 a b s t r a c t = A b s t r a c t s /11289288. xml > P e r r o t t i P, Antropoli C, Molino D, et a l. Conservative treatment of acute thrombosed e x t e r n a l h e m o r r h o i d s w i t h t o p i c a l n i f e d i p i n e. Dis Colon Rectum. 2001; 44: 405 409.</ ref> </long> </s n i p> </answer> </r e c o r d> EBM Summarisation Diego Mollá 20/60

Components of the Corpus Question direct extract from the source. Answer split from the source and manually checked. Evidence extracted from the source. Additional text manually extracted from the source and massaged. References PMID looked up in PubMed (automatic and manual procedure). EBM Summarisation Diego Mollá 21/60

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 22/60

Annotation of Text Justifications Goal Identify the text justifications. Align the text justifications with the answer parts. Method Three annotators (members of the research group). Annotation tool contains pre-zoned text: answer summary; body text; recommendations; references. Annotators need to copy and paste (and massage) the text. EBM Summarisation Diego Mollá 23/60

Annotation Tool I EBM Summarisation Diego Mollá 24/60

Annotation Tool II EBM Summarisation Diego Mollá 25/60

Annotating Answer Justifications Conventions for text massaging 1. Remove/edit connecting phrases. 2. Remove irrelevant introductory text. 3. If a paragraph has several references, attempt to split the paragraph. May need to massage the text of resulting splits. 4. If a paragraph has no references, attempt to merge with previous or next paragraph. EBM Summarisation Diego Mollá 26/60

Finding PubMed IDs Method 1. Split the reference text into sentences. 2. Remove author and pagination text: Use simple regexps. 3. Perform a sequence of searches with all combinations of sentences. EBM Summarisation Diego Mollá 27/60

Example I Collins NC. Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? Emerg Med J. 2008; 25: 65-68. Collins NC. Is ice right? Does cryotherapy improve outcome for acute soft tissue injury Emerg Med J. 2008; 25: 65-68. EBM Summarisation Diego Mollá 28/60

Example II list search ID title match % 1, 2, 3 Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? Emerg Med J 1, 2 Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? 18212134 Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? 18212134 Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? 1, 3 Is ice right? Emerg Med J 18212134 Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? 2, 3 Does cryotherapy improve outcome for acute soft tissue injury? Emerg Med J 18212134 Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? 1 Is ice right? None None 0 2 Does cryotherapy improve outcome for acute soft tissue injury? 15496998 Does Cryotherapy Improve Outcomes With Soft Tissue Injury? 78 3 Emerg Med J None None 0 92 100 39 82 EBM Summarisation Diego Mollá 29/60

Using Amazon Mechanical Turk I Mechanics AMT was used to find the correct IDs. An AMT hit had 10 references: 2 known references for checking quality of annotation. Each hit was assigned to 5 Turkers. There was a preliminary training session. EBM Summarisation Diego Mollá 30/60

Using Amazon Mechanical Turk II Approving and rejecting hits Reject hit if there are two or more bad IDs, i.e. one of: A known ID is wrong. The ID is invalid: Not found in PubMed; No title is returned. The title of the ID does not match the title of our reference: threshold: 50% match. The ID does not agree with majority. EBM Summarisation Diego Mollá 31/60

Using Amazon Mechanical Turk III Checking validity for final annotation Majority wins automatically except when: majority is a bad ID; majority is the nf ID; the other two are agreeing ( full house ). Manual check is done in all other cases. EBM Summarisation Diego Mollá 32/60

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 33/60

Corpus Statistics Size 456 questions ( records ). 1,396 answers ( snips ). 3,036 text explanations ( longs ). 3,705 references: 2,908 unique references. 2,657 XML abstracts from PubMed. EBM Summarisation Diego Mollá 34/60

Answers per Question Avg=3.06 EBM Summarisation Diego Mollá 35/60

Answer justifications per answer Avg=2.17 EBM Summarisation Diego Mollá 36/60

References per answer justification Avg=1.22 EBM Summarisation Diego Mollá 37/60

References per question Avg=6.57 EBM Summarisation Diego Mollá 38/60

Evidence Grade EBM Summarisation Diego Mollá 39/60

References EBM Summarisation Diego Mollá 40/60

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 41/60

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 42/60

Evidence-based Summarisation Single Document Summarisation Input: Question, reference. Target: Text explanation. EBM Summarisation Diego Mollá 43/60

Evidence-based Summarisation Single Document Summarisation Input: Question, reference. Target: Text explanation. Multi-document Summarisation Input: Question, group of relevant references. Target: Answer parts (optional: plus text explanation). EBM Summarisation Diego Mollá 43/60

Appraisal, Clustering Text Classification for Appraisal Input: Group of references. Target: Evidence-based grade. EBM Summarisation Diego Mollá 44/60

Appraisal, Clustering Text Classification for Appraisal Input: Group of references. Target: Evidence-based grade. Clustering Input: Question, group of relevant references. Target: Cluster groupings (optional: plus answer parts). EBM Summarisation Diego Mollá 44/60

Retrieval? Possible task Input: Question. Target: List of references. EBM Summarisation Diego Mollá 45/60

Retrieval? Possible task Input: Question. Target: List of references. However... Some of the references are old. The references are likely not exhaustive. EBM Summarisation Diego Mollá 45/60

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 46/60

Input, Output Input Question. Document Abstract. Output Extractive summary that answers the question. Target summary is the annotated evidence text ( long ). Evaluated using ROUGE-L with Stemming. EBM Summarisation Diego Mollá 47/60

Baselines plain Return the last n sentences. keywords Return the last n sentences that share any non-stop words with the question. umls Return the last n sentences that share any UMLS concepts with the question. System F Conf Interval baseline plain 0.193 [0.190 0.196] baseline keywords 0.195 [0.192 0.198] baseline umls 0.194 [0.190 0.197] EBM Summarisation Diego Mollá 48/60

Using the Abstract Structure Preselect sentences and then: Abstract Section 1 S1.1 S1.2 Section 2 S2.1 Section 3 S3.1 S3.2 Section 4 S4.1 S4.2 Section 5 S5.1 S5.2 Summary EBM Summarisation Diego Mollá 49/60

Using the Abstract Structure Preselect sentences and then: 1. Use PubMed s section tags (background, conclusions, methods, objective, results). Abstract Background S1.1 S1.2 Methods S2.1 Results S3.1 S3.2 Conclusions S4.1 S4.2 Conclusions S5.1 S5.2 Summary EBM Summarisation Diego Mollá 49/60

Using the Abstract Structure Preselect sentences and then: 1. Use PubMed s section tags (background, conclusions, methods, objective, results). 2. Select the first n sentences of the last conclusions section. Abstract Background S1.1 S1.2 Methods S2.1 Results S3.1 S3.2 Conclusions S4.1 S4.2 Conclusions S5.1 S5.2 Summary S5.1 S5.2 EBM Summarisation Diego Mollá 49/60

Using the Abstract Structure Preselect sentences and then: 1. Use PubMed s section tags (background, conclusions, methods, objective, results). 2. Select the first n sentences of the last conclusions section. 3. If we have less than n sentences, fill from the first sentences of the previous conclusions section, and so on until all conclusions sections are used up. Abstract Background S1.1 S1.2 Methods S2.1 Results S3.1 S3.2 Conclusions S4.1 S4.2 Conclusions S5.1 S5.2 Summary S5.1 S5.2 S4.1 S4.2 EBM Summarisation Diego Mollá 49/60

Using the Abstract Structure Preselect sentences and then: 1. Use PubMed s section tags (background, conclusions, methods, objective, results). 2. Select the first n sentences of the last conclusions section. 3. If we have less than n sentences, fill from the first sentences of the previous conclusions section, and so on until all conclusions sections are used up. 4. If we have less than n sentences, fill from the results sections. Abstract Background S1.1 S1.2 Methods S2.1 Results S3.1 S3.2 Conclusions S4.1 S4.2 Conclusions S5.1 S5.2 Summary S5.1 S5.2 S4.1 S4.2 S3.1 EBM Summarisation Diego Mollá 49/60

Using the Abstract Structure Preselect sentences and then: 1. Use PubMed s section tags (background, conclusions, methods, objective, results). 2. Select the first n sentences of the last conclusions section. 3. If we have less than n sentences, fill from the first sentences of the previous conclusions section, and so on until all conclusions sections are used up. 4. If we have less than n sentences, fill from the results sections. 5. If we still have less than n sentences, fill from the methods sections. Abstract Background S1.1 S1.2 Methods S2.1 Results S3.1 S3.2 Conclusions S4.1 S4.2 Conclusions S5.1 S5.2 Summary S5.1 S5.2 S4.1 S4.2 S3.1 EBM Summarisation Diego Mollá 49/60

Using the Abstract Structure Preselect sentences and then: 1. Use PubMed s section tags (background, conclusions, methods, objective, results). 2. Select the first n sentences of the last conclusions section. 3. If we have less than n sentences, fill from the first sentences of the previous conclusions section, and so on until all conclusions sections are used up. 4. If we have less than n sentences, fill from the results sections. 5. If we still have less than n sentences, fill from the methods sections. 6. If the abstract has no structure, return the last n sentences. Abstract Background S1.1 S1.2 Methods S2.1 Results S3.1 S3.2 Conclusions S4.1 S4.2 Conclusions S5.1 S5.2 Summary S5.1 S5.2 S4.1 S4.2 S3.1 EBM Summarisation Diego Mollá 49/60

Results The F is calculated using ROUGE-L with stemming. System F Conf Interval baseline plain 0.193 [0.190 0.196] baseline keywords 0.195 [0.192 0.198] baseline umls 0.194 [0.190 0.197] structure plain 0.196 [0.193 0.199] structure keywords 0.193 [0.190 0.197] structure umls 0.192 [0.189 0.195] EBM Summarisation Diego Mollá 50/60

ROUGE-L with Stemming for All 3-Sentence Subsets I Process 1. Compute the ROUGE-L of all 3-sentence subsets in each abstract. 2. Find the decile boundaries in each abstract. 3. Find the distribution of decile boundaries. 0 1 2 3 4 5 Mean 0.094 0.136 0.153 0.164 0.176 0.188 Std Dev 0.060 0.062 0.065 0.067 0.070 0.073 6 7 8 9 10 Mean 0.200 0.213 0.229 0.249 0.299 Std Dev 0.076 0.081 0.087 0.094 0.112 EBM Summarisation Diego Mollá 51/60

ROUGE-L with Stemming for All 3-Sentence Subsets II EBM Summarisation Diego Mollá 52/60

Contents Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading EBM Summarisation Diego Mollá 53/60

ALTA 2011 Shared Task The ALTA Shared Tasks Competitions where all participants are evaluated on the same data. The ALTA 2011 shared task was based on evidence grading. The Data Clusters of abstracts. The SOR grade of each cluster. EBM Summarisation Diego Mollá 54/60

Data Sample Fragment 41711 B 10553790 15265350 53581 C 12804123 16026213 14627885 53583 B 15213586 52401 A 15329425 9058342 11279767 EBM Summarisation Diego Mollá 55/60

Words as Features Abstract n-grams Generated n-grams (n = 1, 2, 3, 4) for each of the abstracts. Replaced specific medical concepts with generic sem type tags using UMLS. Stemmed, lowercased, stop words removed. Title n-grams Generated n-grams (n = 1, 2) for each title. Processed in the same way as abstract n-grams. EBM Summarisation Diego Mollá 56/60

Publication Types as Features I Distribution of publication types in a different corpus. EBM Summarisation Diego Mollá 57/60

Publication Types as Features II Publication types Rule-based classifier to detect publication types. Simple regular expressions that identify major publication types. Used the publication types marked up by PubMed when available. If an article has several possible publication types, choose the one with highest quality. EBM Summarisation Diego Mollá 58/60

Cascaded Classification Process: Cascaded SVMs 1. Default class: B. 2. SVMs with abstract n-grams to identify A and C. 3. SVMs with publication types to identify A and C. 4. SVMs with title n-grams to identify A and C. Results Method Accuracy Confidence Intervals Majority (B) 48.63% 41.5 55.83 Cascaded SVMs 62.84% EBM Summarisation Diego Mollá 59/60

Questions? Evidence Based Medicine Our Corpus for Summarisation Structure of our Corpus How we Created the Corpus Statistics Applications Possible Uses Single-document Summarisation Evidence Grading Further Information http://web.science.mq.edu.au/~diego/medicalnlp/ EBM Summarisation Diego Mollá 60/60