Terms in Time and Times in Context: A Graph-based Term-Time Ranking Model
|
|
- Katherine Tucker
- 5 years ago
- Views:
Transcription
1 Terms in Time and Times in Context: A Graph-based Term-Time Ranking Model Andreas Spitz, Jannik Strötgen, Thomas Bögel and Michael Gertz Heidelberg University Institute of Computer Science Database Systems Research Group spitz@informatik.uni-heidelberg.de 5th Temporal Web Analytics Workshop Florence, May 18, 2015
2 What happened on June 15, 1215? A simple question. How simple is the answer? Terms in Time and Times in Context Andreas Spitz 1 of 24
3 Motivation Co-ooccurrence Graphs With structured data: quite simple Terms in Time and Times in Context Term-Ranking Projection Application Summary Based on unstructured text data: much more challenging Andreas Spitz 2 of 24
4 Data Set and Approach A corpus of all English Wikipedia articles: Only text is considered, no info-boxes 3, 079, 620 documents with time expressions Problem statement, given such a corpus: Extract and normalize temporal expressions (dates) Find key terms that best summarize a given date Terms in Time and Times in Context Andreas Spitz 3 of 24
5 Outline Outline of the approach: Represent date-term co-occurrences efficiently Extract and normalize temporal expressions (dates) Extract content words that co-occur with dates Generate an efficient data structure Based on this representation Identify relevant terms for any given date Identify similar dates for any given date Example applications Terms in Time and Times in Context Andreas Spitz 4 of 24
6 Extraction of Temporal Expressions Normalization, e.g., May 18, Handling relative temporal expressions, e.g., in May Considering the document type Source: Strötgen, Gertz Multilingual and Cross-domain Temporal Tagging (2013) Terms in Time and Times in Context Andreas Spitz 5 of 24
7 Coverage of Dates We use a combination of dates of three granularities: YYYY-MM-DD (day) YYYY-MM (month) YYYY (year) Percentage of dates that are included in the data per year year coverage in % Terms in Time and Times in Context Andreas Spitz 6 of 24
8 Extraction of Terms and Representation For all sentences s in any Wikipedia document: Terms in Time and Times in Context Andreas Spitz 7 of 24
9 Extraction of Terms and Representation Identify/normalize dates and remove stop words Terms in Time and Times in Context Andreas Spitz 7 of 24
10 Extraction of Terms and Representation Create a bipartite graph G s = (T s D s, E s ) with weights ω s Terms in Time and Times in Context Andreas Spitz 7 of 24
11 Extraction of Terms and Representation Satisfy the inclusion condition for dates Terms in Time and Times in Context Andreas Spitz 7 of 24
12 Extraction of Terms and Representation Satisfy the inclusion condition for dates Terms in Time and Times in Context Andreas Spitz 7 of 24
13 Graph aggregation Aggregate the sentence-graphs G s : T := T s D := D s E := E s ω(e) := ω s (e) We obtain G = (T D, E, ω) with: T = 3, 748, 730 terms D = 210, 375 dates E = 110, 639, 525 edges Terms in Time and Times in Context Andreas Spitz 8 of 24
14 Formalising the Question What happened on June 15, 1215? Which terms in the graph co-occur in a significant manner with the date ? Terms in Time and Times in Context Andreas Spitz 9 of 24
15 Ranking We need a ranking-function from dates D to a list of terms T r : D R T r(d) := ranking of terms t T by their significance for d Terms in Time and Times in Context Andreas Spitz 10 of 24
16 Ranking We need a ranking-function from dates D to a list of terms T r : D R T r(d) := ranking of terms t T by their significance for d Idea: adapt tf-idf to the bipartite graph tf-idf := tf log 1 df tf : frequency of term in document df : fraction of documents that contain the term Terms in Time and Times in Context Andreas Spitz 10 of 24
17 Adapting tf-idf How does this relate to the graph? Identify dates with documents, i.e., dates contain terms Term frequency given by edge weights: tf(d, t) ω(d, t) Inverse document frequency given by number of neighbours: idf(t) D deg(t) tf-idf := tf log 1 df tf-idf(d, t) := ω(d, t) log D deg(t) Terms in Time and Times in Context Andreas Spitz 11 of 24
18 June 15, 1215 Terms in Time and Times in Context Andreas Spitz 12 of 24
19 June 15, 1215 Terms in Time and Times in Context Andreas Spitz 12 of 24
20 A Ranking for Dates Ranking dates by term works analogously: tf-idf(t, d) := ω(t, d) log T deg(d) Terms in Time and Times in Context Andreas Spitz 13 of 24
21 A Ranking for Dates Ranking dates by term works analogously: tf-idf(t, d) := ω(t, d) log T deg(d) Terms in Time and Times in Context Andreas Spitz 13 of 24
22 Ranking Nodes by Similarity Within a Set Can we create a ranking for dates by dates?... or for terms by terms? Terms in Time and Times in Context Andreas Spitz 14 of 24
23 Ranking Nodes by Similarity Within a Set Can we create a ranking for dates by dates?... or for terms by terms? Formally this is a one-mode projection of the bipartite graph: Reduce graph to a single set of nodes T or D Connect nodes that share neighbours in the bipartite graph This results in a very dense graph How can we identify relevant edges in the projection? Terms in Time and Times in Context Andreas Spitz 14 of 24
24 Cosine Similarity of Adjacency Vectors In a lesson from Collaborative Filtering : use a cosine similarity of adjacency vectors cos(t a, t b ) := tai t bi t 2 ai t 2 bi Terms in Time and Times in Context Andreas Spitz 15 of 24
25 Evaluation Ground truth: U.S. Election Days ( ) Recurs annually Always on Tuesday after the first Monday in November (Nov 2 - Nov 8) Every four years: presidential election Terms in Time and Times in Context Andreas Spitz 16 of 24
26 Evaluation Ground truth: U.S. Election Days ( ) Recurs annually Always on Tuesday after the first Monday in November (Nov 2 - Nov 8) Every four years: presidential election Expectation: For a given election day, election days in other years are ranked highly For presidential election days, other presidential election days are ranked highly Terms in Time and Times in Context Andreas Spitz 16 of 24
27 Precision at k 0.6 precision k election day presidential general prec k := Election days in top k ranks k Terms in Time and Times in Context Andreas Spitz 17 of 24
28 Area Under the ROC Curve year AUC election day general presidential Terms in Time and Times in Context Andreas Spitz 18 of 24
29 Practical Application: Hot Spots & Key Players Here: approximation of countries activity during given months For each European country c, define its name, e.g. t n (c) = italy, define the countries adjectival form, e.g. t a (c) = italian, compute individual tf-idf scores for terms and combine. act(c, d) := tf-idf(d, t n(c)) + tf-idf(d, t a (c)) max[tf-idf(d, )] Terms in Time and Times in Context Andreas Spitz 19 of 24
30 Activity by Country During World War II March 1939 September 1939 November 1939 German occupation German invasion Soviet invasion of Czechoslovakia of Poland of Finland activity 1.5 April 1940 May 1940 October 1940 German assault German invasion Battle of Britain and on Norway of France Greco Italian War Terms in Time and Times in Context Andreas Spitz 20 of 24
31 Activity by Country During World War II (2) April 1941 June 1941 September 1943 Invasion of Yugoslavia German offensive Allied invasion and Greece against the Soviets of Italy June 1944 August 1944 May 1945 Allied forces land in Liberation of Paris German Capitulation Normandy (DDay) activity Terms in Time and Times in Context Andreas Spitz 21 of 24
32 Summary Approach: Extract dates and terms from unstructured text Construct a bipartite date-term graph Allows ranking dates / terms according to co-occurrences Benefits: Simple measures already yield good results Efficient: 4GB Memory and real-time queries Flexibility of ranking methods Terms in Time and Times in Context Andreas Spitz 22 of 24
33 Ongoing Work Terms in Time and Times in Context Andreas Spitz 23 of 24
34 Thank you! Questions? Terms in Time and Times in Context Andreas Spitz 24 of 24
So Far Away and Yet so Close: Augmenting Toponym Disambiguation and Similarity with Text-Based Networks
So Far Away and Yet so Close: Augmenting Toponym Disambiguation and Similarity with Text-Based Networks Andreas Spitz Johanna Geiß Michael Gertz Institute of Computer Science, Heidelberg University Im
More informationDiscrete distribution. Fitting probability models to frequency data. Hypotheses for! 2 test. ! 2 Goodness-of-fit test
Discrete distribution Fitting probability models to frequency data A probability distribution describing a discrete numerical random variable For example,! Number of heads from 10 flips of a coin! Number
More informationVector Space Model. Yufei Tao KAIST. March 5, Y. Tao, March 5, 2013 Vector Space Model
Vector Space Model Yufei Tao KAIST March 5, 2013 In this lecture, we will study a problem that is (very) fundamental in information retrieval, and must be tackled by all search engines. Let S be a set
More informationWhat is Text mining? To discover the useful patterns/contents from the large amount of data that can be structured or unstructured.
What is Text mining? To discover the useful patterns/contents from the large amount of data that can be structured or unstructured. Text mining What can be used for text mining?? Classification/categorization
More informationVariable Latent Semantic Indexing
Variable Latent Semantic Indexing Prabhakar Raghavan Yahoo! Research Sunnyvale, CA November 2005 Joint work with A. Dasgupta, R. Kumar, A. Tomkins. Yahoo! Research. Outline 1 Introduction 2 Background
More informationLatent Geographic Feature Extraction from Social Media
Latent Geographic Feature Extraction from Social Media Christian Sengstock* Michael Gertz Database Systems Research Group Heidelberg University, Germany November 8, 2012 Social Media is a huge and increasing
More informationPYROGEOGRAPHY OF THE IBERIAN PENINSULA
PYROGEOGRAPHY OF THE IBERIAN PENINSULA Teresa J. Calado (1), Carlos C. DaCamara (1), Sílvia A. Nunes (1), Sofia L. Ermida (1) and Isabel F. Trigo (1,2) (1) Instituto Dom Luiz, Universidade de Lisboa, Lisboa,
More informationNatural Language Processing. Topics in Information Retrieval. Updated 5/10
Natural Language Processing Topics in Information Retrieval Updated 5/10 Outline Introduction to IR Design features of IR systems Evaluation measures The vector space model Latent semantic indexing Background
More informationLatent Semantic Analysis. Hongning Wang
Latent Semantic Analysis Hongning Wang CS@UVa Recap: vector space model Represent both doc and query by concept vectors Each concept defines one dimension K concepts define a high-dimensional space Element
More informationData Mining Recitation Notes Week 3
Data Mining Recitation Notes Week 3 Jack Rae January 28, 2013 1 Information Retrieval Given a set of documents, pull the (k) most similar document(s) to a given query. 1.1 Setup Say we have D documents
More informationWiki Definition. Reputation Systems I. Outline. Introduction to Reputations. Yury Lifshits. HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte
Reputation Systems I HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte Yury Lifshits Wiki Definition Reputation is the opinion (more technically, a social evaluation) of the public toward a person, a
More informationCAIM: Cerca i Anàlisi d Informació Massiva
1 / 21 CAIM: Cerca i Anàlisi d Informació Massiva FIB, Grau en Enginyeria Informàtica Slides by Marta Arias, José Balcázar, Ricard Gavaldá Department of Computer Science, UPC Fall 2016 http://www.cs.upc.edu/~caim
More informationRetrieval by Content. Part 2: Text Retrieval Term Frequency and Inverse Document Frequency. Srihari: CSE 626 1
Retrieval by Content Part 2: Text Retrieval Term Frequency and Inverse Document Frequency Srihari: CSE 626 1 Text Retrieval Retrieval of text-based information is referred to as Information Retrieval (IR)
More informationSTATISTICAL FORECASTING and SEASONALITY (M. E. Ippolito; )
STATISTICAL FORECASTING and SEASONALITY (M. E. Ippolito; 10-6-13) PART I OVERVIEW The following discussion expands upon exponential smoothing and seasonality as presented in Chapter 11, Forecasting, in
More informationSTA141C: Big Data & High Performance Statistical Computing
STA141C: Big Data & High Performance Statistical Computing Lecture 6: Numerical Linear Algebra: Applications in Machine Learning Cho-Jui Hsieh UC Davis April 27, 2017 Principal Component Analysis Principal
More informationCross-lingual and temporal Wikipedia analysis
MTA SZTAKI Data Mining and Search Group June 14, 2013 Supported by the EC FET Open project New tools and algorithms for directed network analysis (NADINE No 288956) Table of Contents 1 Link prediction
More informationChap 2: Classical models for information retrieval
Chap 2: Classical models for information retrieval Jean-Pierre Chevallet & Philippe Mulhem LIG-MRIM Sept 2016 Jean-Pierre Chevallet & Philippe Mulhem Models of IR 1 / 81 Outline Basic IR Models 1 Basic
More informationHELCOM-VASAB Maritime Spatial Planning Working Group Twelfth Meeting Gdansk, Poland, February 2016
HELCOM-VASAB Maritime Spatial Planning Working Group Twelfth Meeting Gdansk, Poland, 24-25 February 2016 Document title HELCOM database for the coastal and marine Baltic Sea protected areas (HELCOM MPAs).
More informationPredicates and Quantifiers
Predicates and Quantifiers Lecture 9 Section 3.1 Robb T. Koether Hampden-Sydney College Wed, Jan 29, 2014 Robb T. Koether (Hampden-Sydney College) Predicates and Quantifiers Wed, Jan 29, 2014 1 / 32 1
More informationBoolean and Vector Space Retrieval Models
Boolean and Vector Space Retrieval Models Many slides in this section are adapted from Prof. Joydeep Ghosh (UT ECE) who in turn adapted them from Prof. Dik Lee (Univ. of Science and Tech, Hong Kong) 1
More informationSpatializing time in a history text corpus
Zurich Open Repository and Archive University of Zurich Main Library Strickhofstrasse 39 CH-8057 Zurich www.zora.uzh.ch Year: 2014 Spatializing time in a history text corpus Bruggmann, André; Fabrikant,
More informationRETRIEVAL MODELS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS
RETRIEVAL MODELS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Boolean model Vector space model Probabilistic
More informationSparse vectors recap. ANLP Lecture 22 Lexical Semantics with Dense Vectors. Before density, another approach to normalisation.
ANLP Lecture 22 Lexical Semantics with Dense Vectors Henry S. Thompson Based on slides by Jurafsky & Martin, some via Dorota Glowacka 5 November 2018 Previous lectures: Sparse vectors recap How to represent
More informationANLP Lecture 22 Lexical Semantics with Dense Vectors
ANLP Lecture 22 Lexical Semantics with Dense Vectors Henry S. Thompson Based on slides by Jurafsky & Martin, some via Dorota Glowacka 5 November 2018 Henry S. Thompson ANLP Lecture 22 5 November 2018 Previous
More informationA Plot of the Tracking Signals Calculated in Exhibit 3.9
CHAPTER 3 FORECASTING 1 Measurement of Error We can get a better feel for what the MAD and tracking signal mean by plotting the points on a graph. Though this is not completely legitimate from a sample-size
More informationSemantics with Dense Vectors. Reference: D. Jurafsky and J. Martin, Speech and Language Processing
Semantics with Dense Vectors Reference: D. Jurafsky and J. Martin, Speech and Language Processing 1 Semantics with Dense Vectors We saw how to represent a word as a sparse vector with dimensions corresponding
More informationArgo data management November 2nd, 2005 Ref : cordo/dti-rap/ Version 1.1 ARGO DATA MANAGEMENT REPORT FRENCH DAC
Argo data management November 2nd, 2005 Ref : cordo/dti-rap/05-145 Version 1.1 ARGO DATA MANAGEMENT REPORT FRENCH DAC 2 Argo National Data Management Report of France November 2005 Introduction This document
More informationInformation Retrieval and Web Search
Information Retrieval and Web Search IR models: Vector Space Model IR Models Set Theoretic Classic Models Fuzzy Extended Boolean U s e r T a s k Retrieval: Adhoc Filtering Brosing boolean vector probabilistic
More informationDISTRIBUTIONAL SEMANTICS
COMP90042 LECTURE 4 DISTRIBUTIONAL SEMANTICS LEXICAL DATABASES - PROBLEMS Manually constructed Expensive Human annotation can be biased and noisy Language is dynamic New words: slangs, terminology, etc.
More informationSemantic Similarity from Corpora - Latent Semantic Analysis
Semantic Similarity from Corpora - Latent Semantic Analysis Carlo Strapparava FBK-Irst Istituto per la ricerca scientifica e tecnologica I-385 Povo, Trento, ITALY strappa@fbk.eu Overview Latent Semantic
More informationAPPENDIX M LAKE ELEVATION AND FLOW RELEASES SENSITIVITY ANALYSIS RESULTS
APPENDIX M LAKE ELEVATION AND FLOW RELEASES SENSITIVITY ANALYSIS RESULTS Appendix M Lake Elevation and Flow Releases Sensitivity Analysis Results Figure M-1 Lake Jocassee Modeled Reservoir Elevations (Current
More informationEssential Maths Skills. for AS/A-level. Geography. Helen Harris. Series Editor Heather Davis Educational Consultant with Cornwall Learning
Essential Maths Skills for AS/A-level Geography Helen Harris Series Editor Heather Davis Educational Consultant with Cornwall Learning Contents Introduction... 5 1 Understanding data Nominal, ordinal and
More informationWHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and Rainfall For Selected Arizona Cities
WHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and 2001-2002 Rainfall For Selected Arizona Cities Phoenix Tucson Flagstaff Avg. 2001-2002 Avg. 2001-2002 Avg. 2001-2002 October 0.7 0.0
More informationComputation of Large Sparse Aggregated Areas for Analytic Database Queries
Computation of Large Sparse Aggregated Areas for Analytic Database Queries Steffen Wittmer Tobias Lauer Jedox AG Collaborators: Zurab Khadikov Alexander Haberstroh Peter Strohm Business Intelligence and
More informationDensest subgraph computation and applications in finding events on social media
Densest subgraph computation and applications in finding events on social media Oana Denisa Balalau advised by Mauro Sozio Télécom ParisTech, Institut Mines Télécom December 4, 2015 1 / 28 Table of Contents
More informationAs it is not necessarily possible to satisfy this equation, we just ask for a solution to the more general equation
Graphs and Networks Page 1 Lecture 2, Ranking 1 Tuesday, September 12, 2006 1:14 PM I. II. I. How search engines work: a. Crawl the web, creating a database b. Answer query somehow, e.g. grep. (ex. Funk
More informationWorld Book Online: World War II
World Book Online: The trusted, student-friendly online reference tool. World Book Kids Database Name: Date: World War II was the most destructive war in world history. It killed more people, destroyed
More informationChapter 10: Information Retrieval. See corresponding chapter in Manning&Schütze
Chapter 10: Information Retrieval See corresponding chapter in Manning&Schütze Evaluation Metrics in IR 2 Goal In IR there is a much larger variety of possible metrics For different tasks, different metrics
More informationMaschinelle Sprachverarbeitung
Maschinelle Sprachverarbeitung Retrieval Models and Implementation Ulf Leser Content of this Lecture Information Retrieval Models Boolean Model Vector Space Model Inverted Files Ulf Leser: Maschinelle
More informationDrought News August 2014
European Drought Observatory (EDO) Drought News August 2014 (Based on data until the end of July) http://edo.jrc.ec.europa.eu August 2014 EDO (http://edo.jrc.ec.europa.eu) Page 2 of 8 EDO Drought News
More informationTurn: Summer Economic Trend +1
Turn: Summer 95 Random events Economic Climate European Aggression Index Support Effects Factories Income Expenditures Activity Counters Mobilizations Axis and Allied Forces Balance of Power Spy Rings
More informationTemporal Evolution of the ScientificCollaboration Network in Europe
Temporal Evolution of the ScientificCollaboration Network in Europe Jori Jämsä 23.3.2015 Instructor: Ph.D Raj Kumar Pan Supervisor: Prof. Harri Ehtamo This work can be saved and published in Aalto University
More informationCQC. Composing Quantum Channels
CQC Composing Quantum Channels Michael Wolf echnical University of Munich (UM) Matthias Christandl EH Zurich (EH) David Perez-Garcia Universidad Complutense de Madrid (UCM) Presenter: David Reeb, postdoc
More informationCell- Each box of information in a chart or table. A cell is named by its column and row
Lesson 7 Cell- Each box of information in a chart or table. A cell is named by its column and row Steps to read a chart or table: Check the title to find out what type of information is being presented.
More informationThe observed global warming of the lower atmosphere
WATER AND CLIMATE CHANGE: CHANGES IN THE WATER CYCLE 3.1 3.1.6 Variability of European precipitation within industrial time CHRISTIAN-D. SCHÖNWIESE, SILKE TRÖMEL & REINHARD JANOSCHITZ SUMMARY: Precipitation
More informationOntology-Based News Recommendation
Ontology-Based News Recommendation Wouter IJntema Frank Goossen Flavius Frasincar Frederik Hogenboom Erasmus University Rotterdam, the Netherlands frasincar@ese.eur.nl Outline Introduction Hermes: News
More informationPROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS
PROBABILITY AND INFORMATION THEORY Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Probability space Rules of probability
More informationPaper Reference. 1312/1F Edexcel GCSE Geography A Paper 1F. Foundation Tier. Monday 5 June 2006 Morning Time: 1 hour 45 minutes
Centre No. Paper Reference Surname Initial(s) Candidate No. 1 3 1 2 1 F Signature Paper Reference(s) 1312/1F Edexcel GCSE Geography A Paper 1F Foundation Tier Monday 5 June 2006 Morning Time: 1 hour 45
More informationText Analytics (Text Mining)
http://poloclub.gatech.edu/cse6242 CSE6242 / CX4242: Data & Visual Analytics Text Analytics (Text Mining) Concepts, Algorithms, LSI/SVD Duen Horng (Polo) Chau Assistant Professor Associate Director, MS
More informationText Mining. March 3, March 3, / 49
Text Mining March 3, 2017 March 3, 2017 1 / 49 Outline Language Identification Tokenisation Part-Of-Speech (POS) tagging Hidden Markov Models - Sequential Taggers Viterbi Algorithm March 3, 2017 2 / 49
More informationDesign of Indicators to monitor Strategic priorities: Mapping Knowledge Clusters
Design of Indicators to monitor Strategic priorities: Mapping Knowledge Clusters 5 th International Workshop Sharing the Best Practices in R&D and Education Statistics - Informing National and Institutional
More informationStudents will be able to simplify numerical expressions and evaluate algebraic expressions. (M)
Morgan County School District Re-3 August What is algebra? This chapter develops some of the basic symbolism and terminology that students may have seen before but still need to master. The concepts of
More informationIntroduction to Information Retrieval (Manning, Raghavan, Schutze) Chapter 6 Scoring term weighting and the vector space model
Introduction to Information Retrieval (Manning, Raghavan, Schutze) Chapter 6 Scoring term weighting and the vector space model Ranked retrieval Thus far, our queries have all been Boolean. Documents either
More informationWHO EpiData. A monthly summary of the epidemiological data on selected Vaccine preventable diseases in the European Region
A monthly summary of the epidemiological data on selected Vaccine preventable diseases in the European Region Table : Reported cases for the period September 207 August 208 (data as of 0 October 208) Population
More informationA Convolutional Neural Network-based
A Convolutional Neural Network-based Model for Knowledge Base Completion Dat Quoc Nguyen Joint work with: Dai Quoc Nguyen, Tu Dinh Nguyen and Dinh Phung April 16, 2018 Introduction Word vectors learned
More informationTopic Models and Applications to Short Documents
Topic Models and Applications to Short Documents Dieu-Thu Le Email: dieuthu.le@unitn.it Trento University April 6, 2011 1 / 43 Outline Introduction Latent Dirichlet Allocation Gibbs Sampling Short Text
More informationINTRODUCTION. In March 1998, the tender for project CT.98.EP.04 was awarded to the Department of Medicines Management, Keele University, UK.
INTRODUCTION In many areas of Europe patterns of drug use are changing. The mechanisms of diffusion are diverse: introduction of new practices by new users, tourism and migration, cross-border contact,
More informationMapOSMatic, free city maps for everyone!
MapOSMatic, free city maps for everyone! Thomas Petazzoni thomas.petazzoni@enix.org Libre Software Meeting 2012 http://www.maposmatic.org Thomas Petazzoni () MapOSMatic: free city maps for everyone! July
More informationASSESSMENT OF DIFFERENT WATER STRESS INDICATORS BASED ON EUMETSAT LSA SAF PRODUCTS FOR DROUGHT MONITORING IN EUROPE
ASSESSMENT OF DIFFERENT WATER STRESS INDICATORS BASED ON EUMETSAT LSA SAF PRODUCTS FOR DROUGHT MONITORING IN EUROPE G. Sepulcre Canto, A. Singleton, J. Vogt European Commission, DG Joint Research Centre,
More informationLink Analysis Ranking
Link Analysis Ranking How do search engines decide how to rank your query results? Guess why Google ranks the query results the way it does How would you do it? Naïve ranking of query results Given query
More informationWHO EpiData. A monthly summary of the epidemiological data on selected Vaccine preventable diseases in the WHO European Region
A monthly summary of the epidemiological data on selected Vaccine preventable diseases in the WHO European Region Table : Reported cases for the period November 207 October 208 (data as of 30 November
More informationAalborg Universitet. FuzzyPR Christensen, Hans Ulrich; Ortiz-Arroyo, Daniel. Published in: Rough Sets, Fuzzy Sets, Data Mining and Granular Computing
Aalborg Universitet FuzzyPR Christensen, Hans Ulrich; Ortiz-Arroyo, Daniel Published in: Rough Sets, Fuzzy Sets, Data Mining and Granular Computing DOI (link to publication from Publisher): 10.1007/978-3-540-72530-5_23
More informationJohn Conway s Doomsday Algorithm
1 The algorithm as a poem John Conway s Doomsday Algorithm John Conway introduced the Doomsday Algorithm with the following rhyme: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 The last of Feb., or of Jan. will do
More informationChapter 1 0+7= 1+6= 2+5= 3+4= 4+3= 5+2= 6+1= 7+0= How would you write five plus two equals seven?
Chapter 1 0+7= 1+6= 2+5= 3+4= 4+3= 5+2= 6+1= 7+0= If 3 cats plus 4 cats is 7 cats, what does 4 olives plus 3 olives equal? olives How would you write five plus two equals seven? Chapter 2 Tom has 4 apples
More informationRanked IR. Lecture Objectives. Text Technologies for Data Science INFR Learn about Ranked IR. Implement: 10/10/2018. Instructor: Walid Magdy
Text Technologies for Data Science INFR11145 Ranked IR Instructor: Walid Magdy 10-Oct-2018 Lecture Objectives Learn about Ranked IR TFIDF VSM SMART notation Implement: TFIDF 2 1 Boolean Retrieval Thus
More informationDimensionality reduction
Dimensionality reduction ML for NLP Lecturer: Kevin Koidl Assist. Lecturer Alfredo Maldonado https://www.cs.tcd.ie/kevin.koidl/cs4062/ kevin.koidl@scss.tcd.ie, maldonaa@tcd.ie 2017 Recapitulating: Evaluating
More informationPageRank algorithm Hubs and Authorities. Data mining. Web Data Mining PageRank, Hubs and Authorities. University of Szeged.
Web Data Mining PageRank, University of Szeged Why ranking web pages is useful? We are starving for knowledge It earns Google a bunch of money. How? How does the Web looks like? Big strongly connected
More informationModelling and projecting the postponement of childbearing in low-fertility countries
of childbearing in low-fertility countries Nan Li and Patrick Gerland, United Nations * Abstract In most developed countries, total fertility reached below-replacement level and stopped changing notably.
More informationInformation Retrieval
Introduction to Information Retrieval Lecture 11: Probabilistic Information Retrieval 1 Outline Basic Probability Theory Probability Ranking Principle Extensions 2 Basic Probability Theory For events A
More informationMultivariate Analysis
Prof. Dr. J. Franke All of Statistics 3.1 Multivariate Analysis High dimensional data X 1,..., X N, i.i.d. random vectors in R p. As a data matrix X: objects values of p features 1 X 11 X 12... X 1p 2.
More informationFundamentals of Similarity Search
Chapter 2 Fundamentals of Similarity Search We will now look at the fundamentals of similarity search systems, providing the background for a detailed discussion on similarity search operators in the subsequent
More informationPropositional Calculus. Problems. Propositional Calculus 3&4. 1&2 Propositional Calculus. Johnson will leave the cabinet, and we ll lose the election.
1&2 Propositional Calculus Propositional Calculus Problems Jim Woodcock University of York October 2008 1. Let p be it s cold and let q be it s raining. Give a simple verbal sentence which describes each
More informationSDI in Lombardia (Italy(
SDI in Lombardia (Italy( Italy) Andrea Piccin European SDI Best Practice Awards 2009 - Learning from Best Practices Turin, 26th and 27th November 2009 Lombardia, in Italy, is 4 th Region for territorial
More informationRanked IR. Lecture Objectives. Text Technologies for Data Science INFR Learn about Ranked IR. Implement: 10/10/2017. Instructor: Walid Magdy
Text Technologies for Data Science INFR11145 Ranked IR Instructor: Walid Magdy 10-Oct-017 Lecture Objectives Learn about Ranked IR TFIDF VSM SMART notation Implement: TFIDF 1 Boolean Retrieval Thus far,
More informationPart A. P (w 1 )P (w 2 w 1 )P (w 3 w 1 w 2 ) P (w M w 1 w 2 w M 1 ) P (w 1 )P (w 2 w 1 )P (w 3 w 2 ) P (w M w M 1 )
Part A 1. A Markov chain is a discrete-time stochastic process, defined by a set of states, a set of transition probabilities (between states), and a set of initial state probabilities; the process proceeds
More informationP7.7 A CLIMATOLOGICAL STUDY OF CLOUD TO GROUND LIGHTNING STRIKES IN THE VICINITY OF KENNEDY SPACE CENTER, FLORIDA
P7.7 A CLIMATOLOGICAL STUDY OF CLOUD TO GROUND LIGHTNING STRIKES IN THE VICINITY OF KENNEDY SPACE CENTER, FLORIDA K. Lee Burns* Raytheon, Huntsville, Alabama Ryan K. Decker NASA, Marshall Space Flight
More informationHow Well Are Recessions and Recoveries Forecast? Prakash Loungani, Herman Stekler and Natalia Tamirisa
How Well Are Recessions and Recoveries Forecast? Prakash Loungani, Herman Stekler and Natalia Tamirisa 1 Outline Focus of the study Data Dispersion and forecast errors during turning points Testing efficiency
More informationInformation Retrieval
Introduction to Information Retrieval Lecture 12: Language Models for IR Outline Language models Language Models for IR Discussion What is a language model? We can view a finite state automaton as a deterministic
More informationLab Activity: Climate Variables
Name: Date: Period: Water and Climate The Physical Setting: Earth Science Lab Activity: Climate Variables INTRODUCTION:! The state of the atmosphere continually changes over time in response to the uneven
More informationBlazar Optical Sky Survey BOSS Project ( ) and the long-term optical variability monitoring
Blazar Optical Sky Survey BOSS Project (2013-2016) and the long-term optical variability monitoring Kosmas Gazeas University of Athens Observatory - UOAO Department of Astrophysics, Astronomy and Mechanics
More informationSpace Research at Hvar Observatory
Space Research at Hvar Observatory Roman Brajša Hvar Observatory Faculty of Geodesy, University of Zagreb http://oh.geof.unizg.hr SCERIN-6 Capacity Building Workshop on Earth System Observations Zagreb,
More informationSustainable City Index SCI-2014
Sustainable City Index SCI-2014 Summary Since its successful launch in 2006, the Sustainable Society Index (SSI) still remains the only index that shows in quantitative terms the level of sustainability
More informationCorroborating Information from Disagreeing Views
Corroboration A. Galland WSDM 2010 1/26 Corroborating Information from Disagreeing Views Alban Galland 1 Serge Abiteboul 1 Amélie Marian 2 Pierre Senellart 3 1 INRIA Saclay Île-de-France 2 Rutgers University
More informationMachine Learning CPSC 340. Tutorial 12
Machine Learning CPSC 340 Tutorial 12 Random Walk on Graph Page Rank Algorithm Label Propagation on Graph Assume a strongly connected graph G = (V, A) Label Propagation on Graph Assume a strongly connected
More informationDetermine the trend for time series data
Extra Online Questions Determine the trend for time series data Covers AS 90641 (Statistics and Modelling 3.1) Scholarship Statistics and Modelling Chapter 1 Essent ial exam notes Time series 1. The value
More informationPublic Library Use and Economic Hard Times: Analysis of Recent Data
Public Library Use and Economic Hard Times: Analysis of Recent Data A Report Prepared for The American Library Association by The Library Research Center University of Illinois at Urbana Champaign April
More informationCentral Coast Tracking Trash. Trash Webinar #3 September 19, 2017
Central Coast Tracking Trash Trash Webinar #3 September 19, 2017 Webinar #3: Agenda Trash App v0.2 release Intro to tracking annual progress towards compliance Intro to prioritizing actions Next webinar
More informationBathing water results 2011 Slovakia
Bathing water results Slovakia 1. Reporting and assessment This report gives a general overview of water in Slovakia for the season. Slovakia has reported under the Directive 2006/7/EC since 2008. When
More informationDealing with Text Databases
Dealing with Text Databases Unstructured data Boolean queries Sparse matrix representation Inverted index Counts vs. frequencies Term frequency tf x idf term weights Documents as vectors Cosine similarity
More informationIdentification of Very Shallow Groundwater Regions in the EU to Support Monitoring
Identification of Very Shallow Groundwater Regions in the EU to Support Monitoring Timothy Negley Paul Sweeney Lucy Fish Paul Hendley Andrew Newcombe ARCADIS Syngenta Ltd. Syngenta Ltd. Phasera Ltd. ARCADIS
More informationJANUARY MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY SATURDAY SUNDAY
Vocabulary (01) The Calendar (012) In context: Look at the calendar. Then, answer the questions. JANUARY MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY SATURDAY SUNDAY 1 New 2 3 4 5 6 Year s Day 7 8 9 10 11
More informationThree right directions and three wrong directions for tensor research
Three right directions and three wrong directions for tensor research Michael W. Mahoney Stanford University ( For more info, see: http:// cs.stanford.edu/people/mmahoney/ or Google on Michael Mahoney
More informationAn Implementation of Mobile Sensing for Large-Scale Urban Monitoring
An Implementation of Mobile Sensing for Large-Scale Urban Monitoring Teerayut Horanont 1, Ryosuke Shibasaki 1,2 1 Department of Civil Engineering, University of Tokyo, Meguro, Tokyo 153-8505, JAPAN Email:
More informationA re-sampling based weather generator
A re-sampling based weather generator Sara Martino 1 Joint work with T. Nipen 2 and C. Lussana 2 1 Sintef Energy Resources 2 Norwegian Metereologic Institute Berlin 19th Sept. 2017 Sara Martino Joint work
More informationBoolean and Vector Space Retrieval Models CS 290N Some of slides from R. Mooney (UTexas), J. Ghosh (UT ECE), D. Lee (USTHK).
Boolean and Vector Space Retrieval Models 2013 CS 290N Some of slides from R. Mooney (UTexas), J. Ghosh (UT ECE), D. Lee (USTHK). 1 Table of Content Boolean model Statistical vector space model Retrieval
More informationcarbon dioxide +... (+ light energy) glucose +...
Photosynthesis 1. (i) Complete the word equation for photosynthesis. (ii) carbon dioxide +... (+ light energy) glucose +... Most of the carbon dioxide that a plant uses during photosynthesis is absorbed
More informationOutline for today. Information Retrieval. Cosine similarity between query and document. tf-idf weighting
Outline for today Information Retrieval Efficient Scoring and Ranking Recap on ranked retrieval Jörg Tiedemann jorg.tiedemann@lingfil.uu.se Department of Linguistics and Philology Uppsala University Efficient
More informationREMOTELY SENSED INFORMATION FOR CROP MONITORING AND FOOD SECURITY
LEARNING OBJECTIVES Lesson 4 Methods and Analysis 2: Rainfall and NDVI Seasonal Graphs At the end of the lesson, you will be able to: understand seasonal graphs for rainfall and NDVI; describe the concept
More informationOnline Appendix for Cultural Biases in Economic Exchange? Luigi Guiso Paola Sapienza Luigi Zingales
Online Appendix for Cultural Biases in Economic Exchange? Luigi Guiso Paola Sapienza Luigi Zingales 1 Table A.1 The Eurobarometer Surveys The Eurobarometer surveys are the products of a unique program
More informationINTEROPERABILITY AND DATA HOMOGENIZATION
Page 1 of 5 The Transnational Geo-portal Italian-Slovenian of the Cross-Border Park Area Eva Savina Malinverni DARDUS Università Politecnica delle Marche Ancona, IT e.s.malinverni@univpm.it INTRODUCTION
More information