Extended IR Models. Johan Bollen Old Dominion University Department of Computer Science
1 Extended IR Models. Johan Bollen, Old Dominion University, Department of Computer Science. jbollen. January 20, 2004.
2 Taxonomy of IR models (overview figure):
   User task: Retrieval, Browsing
   Classic models: Boolean, Vector, Probabilistic
   Set-theoretic extensions: Fuzzy, Extended Boolean
   Algebraic extensions: Generalized Vector, LSI, Neural Networks
   Probabilistic extensions: Inference Networks, Belief Networks
   Structured models: Non-overlapping Lists, Proximal Nodes
   Browsing: Flat, Structure Guided, Hypertext
3 Extended Set-Theoretic Models
1. Set-theoretic models (Boolean retrieval):
   (a) Do not allow partial matching
   (b) Do not allow ranking of documents
   (c) However, efficient and widespread
2. Fuzzy sets:
   (a) Allow degrees of membership to be expressed
   (b) Retain the set-theoretic origins of Boolean IR
   (c) Ranking of results
3. Extended Boolean:
   (a) Based on VSM principles
   (b) More frequently deployed than simple Boolean
   (c) Ranking of results
4 Fuzzy Sets and Information Retrieval
1. Traditional logic:
   (a) True-or-false logic
   (b) No perhaps, somewhat, largely
   (c) Cold vs. warm: balmy? hot?
   (d) Need to connect logic to linguistic variables
2. Attempts to produce multi-valued logics
3. How about infinity-valued logic?
   (a) Fuzzy sets vs. crisp sets
   (b) Introduced in the 1960s
   (c) Each element is assigned a membership value in [0,1]
   (d) Standard Boolean operators:
      i. Union: max
      ii. Intersection: min
      iii. Negation: 1 - membership value
4. Use of keyterm-keyterm similarities to define the fuzzy result set
5 Fuzzy Sets
Universe of discourse U: all possible elements.
A fuzzy subset A of U is characterized by a membership function mu_A : U -> [0,1].
Each u in U is mapped to [0,1], i.e. mu_A(u) in [0,1].
So every element is assigned a value indicating the degree to which it is a member of the set, which is a subset of the universe of discourse.
Boolean operations:
   mu_notA(u) = 1 - mu_A(u)
   mu_(A union B)(u) = max(mu_A(u), mu_B(u))
   mu_(A intersect B)(u) = min(mu_A(u), mu_B(u))
Question: where do the membership functions come from?
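As a minimal sketch in Python (the universe elements and membership values below are invented for illustration):

```python
# Fuzzy-set operations as element-wise max / min / complement over
# membership dictionaries (universe element -> degree in [0,1]).
def fuzzy_union(mu_a, mu_b):
    return {u: max(mu_a.get(u, 0.0), mu_b.get(u, 0.0))
            for u in set(mu_a) | set(mu_b)}

def fuzzy_intersection(mu_a, mu_b):
    return {u: min(mu_a.get(u, 0.0), mu_b.get(u, 0.0))
            for u in set(mu_a) | set(mu_b)}

def fuzzy_complement(mu_a):
    return {u: 1.0 - m for u, m in mu_a.items()}

# illustrative membership values
young = {"ann": 0.9, "bob": 0.4, "cee": 0.1}
likes_pr = {"ann": 0.2, "bob": 0.8, "cee": 0.7}

hip = fuzzy_intersection(young, likes_pr)  # young AND likes P&R
```

Note that with max/min as union/intersection, membership degrees are never invented, only combined; this is what keeps the model within its set-theoretic origins.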
6 Fuzzy Sets: Example
Universe of discourse U: the set of people.
The set of hip people H is a fuzzy subset of U: hip people are young and like DJs Precision and Recall.
Two fuzzy subsets: young, and likes Precision and Recall.
Membership functions:
7 Young [figure: membership function for "young"]
8 Likes Precision and Recall [figure: membership function for "likes Precision and Recall"]
9 Young AND Likes Precision and Recall [figure: min of the two membership functions]
10 Young OR Likes Precision and Recall [figure: max of the two membership functions]
11 Fuzzy Sets: Extension of the Boolean Retrieval Model
1. Partially based on a term-term correlation matrix
2. Represented as a thesaurus
3. Calculated from the ratio of documents that contain a pair of terms vs. the number of documents that contain either
Term-term correlation matrix c:
   c_{i,l} = n_{i,l} / (n_i + n_l - n_{i,l})
similar to bibliographic coupling (Kessler 1963) and co-citation (Small 1973).
A fuzzy set is defined on the basis of keyterm k_i. The membership function for document d_j is defined as:
   mu_{i,j} = 1 - prod over k_l in d_j of (1 - c_{i,l})
Document d_j belongs to the fuzzy set of k_i when at least some of the keyterms in the document are close to k_i.
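A small sketch of this construction in Python; the three toy documents and their terms are invented for illustration:

```python
# Fuzzy-set extension of Boolean retrieval: a term-term correlation
# from co-occurrence counts, then a document's degree of membership
# in the fuzzy set of keyterm k_i.
docs = {
    "d1": {"apple", "fruit"},
    "d2": {"apple", "pie"},
    "d3": {"fruit", "pie"},
}

def n(t):  # number of documents containing term t
    return sum(1 for d in docs.values() if t in d)

def n_both(t, u):  # number of documents containing both t and u
    return sum(1 for d in docs.values() if t in d and u in d)

def c(t, u):  # c_{i,l} = n_{i,l} / (n_i + n_l - n_{i,l})
    return n_both(t, u) / (n(t) + n(u) - n_both(t, u))

def membership(t, doc):  # mu_{i,j} = 1 - prod_{l in d_j} (1 - c_{i,l})
    prod = 1.0
    for l in docs[doc]:
        prod *= 1.0 - c(t, l)
    return 1.0 - prod

mu = membership("apple", "d3")  # d3 does not contain "apple" but is related
```

A document that actually contains the keyterm gets membership 1 (since c of a term with itself is 1), while related documents get a graded value, which is exactly what enables ranking.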
12 Remember the Boolean Retrieval Model
1. Boolean expression: a Boolean query expresses the user's information need
2. Converted to DNF (disjunctive normal form)
3. The query matches the DNF components
For example, the query q = k_a AND (k_b OR NOT k_c) becomes
   q_dnf = (1,1,1) OR (1,1,0) OR (1,0,0)
for the tuple (k_a, k_b, k_c).
Let cc_i be the i-th conjunctive component. Let D_a be the fuzzy set of documents with index k_a: D_a contains the documents for which mu_{a,j} > K, where K is a threshold. Same for D_b and D_c.
D_q is then the union of the fuzzy sets associated with cc_1, cc_2 and cc_3, the three conjunctive components.
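The slide's DNF can be checked mechanically by enumerating every binary assignment of (k_a, k_b, k_c):

```python
# Enumerate the conjunctive components of q = k_a AND (k_b OR NOT k_c)
# by testing each binary assignment of the tuple (k_a, k_b, k_c).
from itertools import product

def q(ka, kb, kc):
    return ka and (kb or not kc)

components = [t for t in product([1, 0], repeat=3) if q(*t)]
print(components)  # [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
```

The three satisfying assignments are exactly the conjunctive components cc_1, cc_2 and cc_3 on the slide.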
13 Fuzzy Conjunctive Components [figure]
14 Remember the Boolean Retrieval Model
The membership degree mu_{q,j} is then defined as the algebraic union of the membership degrees of the conjunctive components:
   mu_{q,j} = mu_{cc1+cc2+cc3, j}
            = 1 - prod for i = 1..3 of (1 - mu_{cc_i,j})
            = 1 - (1 - mu_{a,j} mu_{b,j} mu_{c,j})
                  (1 - mu_{a,j} mu_{b,j} (1 - mu_{c,j}))
                  (1 - mu_{a,j} (1 - mu_{b,j}) (1 - mu_{c,j}))
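The formula above can be sketched directly; the membership degrees passed in at the end are invented for illustration:

```python
# Membership of document j in the fuzzy set of the query
# q = k_a AND (k_b OR NOT k_c): algebraic union of its three
# conjunctive components.
def mu_query(mu_a, mu_b, mu_c):
    cc1 = mu_a * mu_b * mu_c              # component (1,1,1)
    cc2 = mu_a * mu_b * (1 - mu_c)        # component (1,1,0)
    cc3 = mu_a * (1 - mu_b) * (1 - mu_c)  # component (1,0,0)
    return 1 - (1 - cc1) * (1 - cc2) * (1 - cc3)

score = mu_query(0.9, 0.5, 0.2)  # illustrative membership degrees
```

Note the boundary behavior: full membership in k_a, k_b and k_c gives mu_{q,j} = 1, while zero membership in k_a forces mu_{q,j} = 0, as the Boolean reading of q requires.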
15 Fuzzy Conjunctive Components [figure]
16 Fuzzy Sets in Information Retrieval
1. Limited appeal:
   (a) Few applications
   (b) Applications to recommender systems
   (c) Different ways to construct the thesaurus
   (d) Scalability: the size of the term-term matrix
2. Relations to query expansion and neural network approaches
17 Extended Boolean Model
1. Boolean model:
   (a) Retrieval is brittle
   (b) A precise Boolean query is difficult to generate
2. Extended:
   (a) Features of the vector space model
   (b) Keyterm weighting
   (c) Partial matching
18 Extended Boolean Model
1. Main principles:
   (a) Represent the document in an n-dimensional term space
   (b) x, y, z, ... coordinates are determined by term weights
   (c) Depending on conjunction or disjunction:
      i. determine the vector's distance from (0,0) (disjunction)
      ii. determine the vector's distance from (1,1) (conjunction)
   (d) Distance calculation:
      i. concept of the p-norm
      ii. varies the characteristics of the extended model
2. Discussion will focus on 2-dimensional problems
19 Term Weighting and Document Vectors in Boolean Term Space
A document is assigned coordinates by term weighting: t terms give a t-dimensional space.
For example, the x coordinate:
   w_{x,j} = f_{x,j} * idf_x / max_i idf_i
Assume two keyterms, with x and y coordinates as above.
Depending on the Boolean query type, certain regions of the space are either required or to be avoided:
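A sketch of this coordinate weighting in Python; the normalized frequencies, document frequencies and collection size below are invented:

```python
# Coordinates w_{x,j} = f_{x,j} * idf_x / max_i idf_i, with idf
# normalized by the largest idf in the vocabulary so that the
# coordinates of a document fall in the unit square/cube.
import math

def coordinates(f_j, df, N):
    """f_j: normalized term frequencies in document j; df: document
    frequencies per term; N: collection size."""
    idf = {t: math.log2(N / df[t]) for t in df}
    max_idf = max(idf.values())
    return {t: f_j.get(t, 0.0) * idf[t] / max_idf for t in df}

# two keyterms x and y: one document's position in the unit square
w = coordinates({"x": 1.0, "y": 0.5}, {"x": 2, "y": 8}, N=16)
```

The base of the logarithm is an assumption here; any base works since the max-idf normalization cancels it.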
20 [figure: regions of the term space required or avoided by AND and OR queries]
21 Query-Document Similarity Measures
Similarities are calculated on the basis of distance to the desired coordinates.
AND: distance to the right-upper corner (1,1).
OR: distance from the left-bottom corner (0,0).
Both normalized by the denominator sqrt(2):
   sim(q_or, d)  = sqrt((x^2 + y^2) / 2)
   sim(q_and, d) = 1 - sqrt(((1-x)^2 + (1-y)^2) / 2)
Characteristics: if w_{x,j} in {0,1}, the document always sits in one of the four corners of the unit square.
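The two-term case can be sketched directly from the formulas above:

```python
# Two-term extended Boolean similarities: an OR query scores by distance
# from (0,0), an AND query by distance from (1,1), both normalized by
# sqrt(2) so the result lies in [0,1].
import math

def sim_or(x, y):
    return math.sqrt((x ** 2 + y ** 2) / 2)

def sim_and(x, y):
    return 1 - math.sqrt(((1 - x) ** 2 + (1 - y) ** 2) / 2)

# with binary weights the document sits in a corner of the unit square:
print(sim_or(1, 0))   # 0.7071067811865476
print(sim_and(1, 1))  # 1.0
```

This already gives partial matching: a document containing only one of two OR'ed terms gets a score of about 0.71 rather than an all-or-nothing 1 or 0.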
22 Query-Document Similarity Measures: p-norm
Generally a t-dimensional space. Use of a generalized vector norm: the p-norm model, with a parameter 1 <= p <= infinity.
General disjunctive query (under the p-norm): q_or = k_1 OR^p k_2 OR^p ... OR^p k_m
   sim(q_or, d_j) = ((x_1^p + x_2^p + ... + x_m^p) / m)^(1/p)
General conjunctive query (under the p-norm): q_and = k_1 AND^p k_2 AND^p ... AND^p k_m
   sim(q_and, d_j) = 1 - (((1-x_1)^p + (1-x_2)^p + ... + (1-x_m)^p) / m)^(1/p)
p = 1: the norm is the (averaged) sum of the weights, similar to the vector space model.
p = infinity:
   sim(q_or, d_j) = max(x_i)
   sim(q_and, d_j) = min(x_i)
Thus: fuzzy logic! The system can be made to adapt naturally to a range of behaviors, from VSM-like to fuzzy.
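The two limiting behaviors are easy to see numerically; the weights below are invented:

```python
# p-norm similarities: p = 1 gives a VSM-like average, and as p grows
# sim_or approaches max(xs) and sim_and approaches min(xs), i.e. the
# fuzzy-logic operators.
def sim_or(xs, p):
    return (sum(x ** p for x in xs) / len(xs)) ** (1 / p)

def sim_and(xs, p):
    return 1 - (sum((1 - x) ** p for x in xs) / len(xs)) ** (1 / p)

xs = [0.9, 0.2, 0.5]
print(round(sim_or(xs, 1), 4))   # 0.5333 -- the plain average
print(sim_or(xs, 50))            # approaches max(xs) = 0.9
print(sim_and(xs, 50))           # approaches min(xs) = 0.2
```

A single parameter thus interpolates between vector-space and fuzzy behavior, which is the main theoretical appeal of the model.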
23 Extended Boolean Model
1. Promising features
2. Apparently introduced in the early 1980s; has found few applications since
3. (a) Appeal of the VSM
   (b) Interpretation of the p-norm
   (c) Provides a theoretical framework
4. Evaluation:
   (a) Criterion: user satisfaction, result relevance
   (b) No applications to large collections that I know of
   (c) Outperforms VSM? Boolean?
24 Extended Algebraic Models
1. Terms are often thought of as orthogonal in the VSM:
   (a) they provide the basis of the space
   (b) one term's weight is independent of another term's weight
2. Not true:
   (a) terms in language occur in groups
   (b) co-occurrence of term groups
   (c) a natural consequence of the semantic nature of information
3. Extended VSM:
   (a) assumes linear independence of terms
   (b) seeks orthogonal vectors to be used as a subspace
25 Minterms and Orthogonal Vectors
Given a set of t index terms {k_1, k_2, ..., k_t}.
Assume w_{i,j} is the weight associated with [k_i, d_j] and w_{i,j} in {0,1} (binary).
g_i(m_j) returns the weight of index term k_i in minterm m_j.
There are 2^t minterms, the possible patterns of term occurrence, for example:
   m_1 = (0,0,...,0)
   m_2 = (0,0,...,1)
   m_{2^t} = (1,1,...,1)
We define a set of orthogonal vectors:
   m_1 = (1,0,...,0)
   m_2 = (0,1,...,0)
   m_{2^t} = (0,0,...,1)
each associated with an element of the set of minterms.
26 Minterms and Orthogonal Vectors: Example
Assuming 3 keyterms:
   minterm    (basic) vector
   (0,0,0)    (1,0,0,0,0,0,0,0)
   (1,0,0)    (0,1,0,0,0,0,0,0)
   (0,1,0)    (0,0,1,0,0,0,0,0)
   (1,1,0)    (0,0,0,1,0,0,0,0)
   (0,0,1)    (0,0,0,0,1,0,0,0)
   (1,0,1)    (0,0,0,0,0,1,0,0)
   (0,1,1)    (0,0,0,0,0,0,1,0)
   (1,1,1)    (0,0,0,0,0,0,0,1)
Although m_i . m_j = 0, minterm m_4 captures the co-occurrence of k_1 and k_2: when there is a document that contains both k_1 and k_2, we say that minterm m_4 is active.
Number of active minterms?
27 Deriving Index Term Vectors
We define correlation factors:
   c_{i,r} = sum of w_{i,j} over the documents d_j whose term pattern matches minterm m_r (g_l(d_j) = g_l(m_r) for all l)
This is essentially a count of the frequency with which each keyterm occurred in an active minterm.
We generate keyterm vectors as a linear combination of all basic vectors corresponding to minterms having nonzero correlation factors for the specific term:
   k_i = (sum over r with g_i(m_r) = 1 of c_{i,r} m_r) / sqrt(sum over r with g_i(m_r) = 1 of c_{i,r}^2)
Once the keyterm vectors have been produced, we can translate queries etc. to minterm vectors and calculate query-document similarities.
Index term correlations then follow from the product of term vectors:
   k_i . k_j = sum over r with g_i(m_r) = 1 and g_j(m_r) = 1 of c_{i,r} c_{j,r}
28 Whhaaaa???
OK OK, that wasn't very clear. We need an example. Let's say we have a set of 6 documents and three keyterms.
Document-keyterm matrix (binary):
        d1  d2  d3  d4  d5  d6
   k1    1   0   1   0   1   0
   k2    0   1   1   1   1   1
   k3    0   0   0   1   0   0
29 Minterms
   c_{i,r} = sum of w_{i,j} over the documents d_j whose term pattern matches minterm m_r
   m_r       active   c_{1,r}  c_{2,r}  c_{3,r}
   (0,0,0)   NO
   (1,0,0)   YES      1        0        0
   (0,1,0)   YES      0        2        0
   (1,1,0)   YES      2        2        0
   (0,0,1)   NO
   (1,0,1)   NO
   (0,1,1)   YES      0        1        1
   (1,1,1)   NO
30 Basic Vectors
Number of minterms = 2^t = 8.
   minterm    basic vector
   (0,0,0)    (1,0,0,0,0,0,0,0) = b1
   (1,0,0)    (0,1,0,0,0,0,0,0) = b2
   (0,1,0)    (0,0,1,0,0,0,0,0) = b3
   (1,1,0)    (0,0,0,1,0,0,0,0) = b4
   (0,0,1)    (0,0,0,0,1,0,0,0) = b5
   (1,0,1)    (0,0,0,0,0,1,0,0) = b6
   (0,1,1)    (0,0,0,0,0,0,1,0) = b7
   (1,1,1)    (0,0,0,0,0,0,0,1) = b8
The final term vectors will be linear combinations of the basic vectors according to minterm state and term occurrence.
31 Generating Term Vectors
   k_i = (sum over r with g_i(m_r) = 1 of c_{i,r} b_r) / sqrt(sum over r with g_i(m_r) = 1 of c_{i,r}^2)
The active minterms are (1,0,0), (0,1,0), (1,1,0) and (0,1,1), i.e. b2, b3, b4 and b7 have c_{i,r} > 0:
   k_1 = (1/sqrt(5)) b2 + (2/sqrt(5)) b4
   k_2 = (2/3) b3 + (2/3) b4 + (1/3) b7
   k_3 = 1 b7
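The whole construction for this example can be sketched in a few lines; the binary matrix is the 6-document, 3-term example above:

```python
# Generalized VSM construction: find each document's term pattern
# (its minterm), accumulate correlation factors c_{i,r}, and normalize
# to obtain the term vectors k_i over the 2^t basic vectors.
import math

# rows = k1..k3, columns = d1..d6
M = [[1, 0, 1, 0, 1, 0],
     [0, 1, 1, 1, 1, 1],
     [0, 0, 0, 1, 0, 0]]
t, ndocs = len(M), len(M[0])

# minterm r has g_i(m_r) = bit i of r, matching the slide's ordering
# (0,0,0), (1,0,0), (0,1,0), (1,1,0), (0,0,1), ...
minterms = [tuple((r >> i) & 1 for i in range(t)) for r in range(2 ** t)]

# c[i][r]: sum of w_{i,j} over documents whose pattern equals minterm r
c = [[0.0] * (2 ** t) for _ in range(t)]
for j in range(ndocs):
    pattern = tuple(M[i][j] for i in range(t))
    r = minterms.index(pattern)
    for i in range(t):
        c[i][r] += M[i][j]

def term_vector(i):
    norm = math.sqrt(sum(x ** 2 for x in c[i]))
    return [x / norm for x in c[i]]

k1, k2, k3 = (term_vector(i) for i in range(t))
```

Running this reproduces the slide's vectors: k1 has weight 1/sqrt(5) on b2 and 2/sqrt(5) on b4, k2 has 2/3 on b3 and b4 and 1/3 on b7, and k3 is exactly b7.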
32 Query Matching
Let's say we have a query: k_1, k_3.
Documents are represented by linear combinations of term vectors:
   d1 = k1
   d2 = k2
   d3 = k1 + k2
   d4 = k2 + k3
   d5 = k1 + k2
   d6 = k2
The query vector is q = k1 + k3.
Use the cosine similarity measure:
   sim(q, d_j) = (q . d_j) / (|q| |d_j|)
33 Query Matching
We know that:
   k1 = (1/sqrt(5)) b2 + (2/sqrt(5)) b4, so k1 = (0, 0.447, 0, 0.894, 0, 0, 0, 0)
   k2 = (2/3) b3 + (2/3) b4 + (1/3) b7, so k2 = (0, 0, 0.667, 0.667, 0, 0, 0.333, 0)
   k3 = b7 = (0, 0, 0, 0, 0, 0, 1, 0)
Document vectors are linear combinations of k1, k2, k3:
   d1 = (0, 0.447, 0, 0.894, 0, 0, 0, 0)
   d2 = (0, 0, 0.667, 0.667, 0, 0, 0.333, 0)
   d3 = (0, 0.447, 0.667, 1.561, 0, 0, 0.333, 0)
   d4 = (0, 0, 0.667, 0.667, 0, 0, 1.333, 0)
   d5 = (0, 0.447, 0.667, 1.561, 0, 0, 0.333, 0)
   d6 = (0, 0, 0.667, 0.667, 0, 0, 0.333, 0)
Query vector:
   q = k1 + k3 = (0, 0.447, 0, 0.894, 0, 0, 1, 0)
34 Similarities
   sim(q, d_j) = (q . d_j) / (|q| |d_j|)
   sim(q, d1) = 0.707
   sim(q, d2) = 0.657
   sim(q, d3) = 0.764
   sim(q, d4) = 0.836
   sim(q, d5) = 0.764
   sim(q, d6) = 0.657
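These cosine similarities follow from the term vectors of the previous slide; a sketch that computes them (using exact fractions rather than the rounded 0.447/0.667 values):

```python
# Cosine similarities for the worked example: document vectors are
# linear combinations of the term vectors, the query is q = k1 + k3.
import math

s5 = math.sqrt(5)
k1 = [0, 1 / s5, 0, 2 / s5, 0, 0, 0, 0]
k2 = [0, 0, 2 / 3, 2 / 3, 0, 0, 1 / 3, 0]
k3 = [0, 0, 0, 0, 0, 0, 1, 0]

def add(*vs):
    return [sum(xs) for xs in zip(*vs)]

def cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

docs = {"d1": k1, "d2": k2, "d3": add(k1, k2),
        "d4": add(k2, k3), "d5": add(k1, k2), "d6": k2}
q = add(k1, k3)
sims = {d: round(cos(q, v), 3) for d, v in docs.items()}
```

Note that d4, which contains both query terms, comes out on top even though its vector shares no term vector with k1; the shared minterm structure carries the dependency.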
35 Comparison to the Classic Vector Space Model
We use the binary term-document matrix:
        d1  d2  d3  d4  d5  d6
   k1    1   0   1   0   1   0
   k2    0   1   1   1   1   1
   k3    0   0   0   1   0   0
Query vector for term 1 and term 2: q = (1,1,0).
We simply multiply q with the term-document matrix:
   q_r = q^T M = (1, 1, 2, 1, 2, 1)
Now divide the entries of q_r by the products of the document and query vector norms:
   p = (1.414, 1.414, 2, 2, 2, 1.414)
36 Comparison to the Classic Vector Space Model
Resulting ranking compared to the GVS:
   i     GVS     CVS
   d1    0.707   0.707
   d2    0.657   0.707
   d3    0.764   1.000
   d4    0.836   0.500
   d5    0.764   1.000
   d6    0.657   0.707
Inclusion of term dependencies makes a significant difference.
37 Evaluation of the Generalized Vector Space Model
1. Comparison to other models:
   (a) Little evidence that the model outperforms existing models
   (b) Problematic to interpret keyterm dependencies
   (c) Difficult to interpret a specific ranking
2. Large collections:
   (a) Very large number of minterms
   (b) Considerable computational overhead
   (c) Is the effort warranted?
3. Implementations:
   (a) Few systems
   (b) Little empirical basis for evaluation
4. Benefits:
   (a) The theoretical notion of keyterm dependence is exploited
   (b) Improved rankings (?)
   (c) Relatively simple extension of the CVS
38 Keyterms and Concepts [figure]
39 Exploiting Term Dependencies
1. Neural network model:
   (a) imitates the capacity of the human brain to process information
   (b) feedforward neural network
   (c) input layer: terms
   (d) output layer: documents
   (e) activation of term nodes propagates to documents
   (f) two-way communication
2. Latent Semantic Indexing:
   (a) create a lower-dimensional space of concepts
   (b) based on Singular Value Decomposition of the term-document matrix
   (c) lower-rank approximation
   (d) project the query vector into the lower-dimensional space
   (e) translate back to documents
40 Neural Network Model
1. Biological nervous systems:
   (a) Parallel computation on a massive scale:
      i. billions of neurons
      ii. complex electro-chemical interactions
      iii. adaptivity to outside stimuli
   (b) No CPU; the von Neumann architecture is absent
   (c) Speed in recognition and knowledge-processing tasks is excellent
41 Human Brain [figure]
42 Neural Network Model
1. Accomplished by:
   (a) layers of connected neurons
   (b) the neuron's cell membrane is semi-permeable to Na+, K+ and Cl- ions
   (c) it can be depolarized by chemical and electrical stimuli
   (d) depolarization produces a spike or activation level: the action potential
   (e) the spike is communicated to other neurons (frequency modulation)
43 Action Potential [figure]
44 Neural Networks
1. Simulations:
   (a) Artificial Neural Networks: a simplified representation
   (b) directed, weighted graphs
   (c) nodes = neurons
   (d) nodes have activation levels
   (e) activation of a node = weighted sum of the connecting nodes' activation levels
2. Applied in:
   (a) image and speech recognition
   (b) adaptive control systems
   (c) models of human behavior and learning
3. Connectionist theory:
   (a) learning to relate stimulus-response activation patterns
   (b) two learning paradigms:
      i. supervised
      ii. unsupervised
   (c) implementation of specific learning algorithms
   (d) structure of networks:
      i. recurrent (e.g. SOM)
      ii. feedforward (perceptron)
45 Neural Network Model [figure]
46 Neural Network Model [figure]
47 Retrieval: Activation Propagation from the Input to the Document Layer
1. Retrieval:
   (a) activation levels are set at the query term layer
   (b) activation spreads to the document term layer
   (c) modulated by the query term weights
   (d) activation propagates to the document layer
   (e) modulated by the term weights in documents
   (f) the essentials of the VSM
2. Iterative procedure:
   (a) activation can move from the document layer back to the document term layer
   (b) this casts document activation back into the term layer and back again, etc.
   (c) activation will wane (weights < 1)
48 Query Term and Document Term Weights
Start at the query term layer: all activated query nodes get activation = 1 (the maximum).
Use of the normal vector space model query term weights:
   w_{i,q} = (freq_{i,q} / max_l freq_{l,q}) * log(N / n_i)
but normalized:
   w'_{i,q} = w_{i,q} / sqrt(sum over i = 1..t of w_{i,q}^2)
Transmission of activation values from document terms to the document layer uses the document term weights:
   w_{i,j} = f_{i,j} * log(N / n_i)
with normalization of weights:
   w'_{i,j} = w_{i,j} / sqrt(sum over i = 1..t of w_{i,j}^2)
Ranking:
   sum over i = 1..t of w'_{i,q} w'_{i,j}
And again!
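One forward pass of this propagation can be sketched as follows; the tiny 3-term, 2-document weight matrix is invented for illustration:

```python
# One forward pass of the neural-network IR model: query-term activations
# (set to 1) spread to document-term nodes via normalized query weights,
# then to document nodes via normalized document weights. The first pass
# therefore reproduces the cosine/VSM score.
import math

def normalise(ws):
    norm = math.sqrt(sum(w * w for w in ws))
    return [w / norm for w in ws] if norm else ws

# normalized query-term weights (3 terms)
wq = normalise([1.0, 0.5, 0.0])

# per-document normalized term-weight columns (2 documents)
W = [normalise(col) for col in ([1.0, 0.0, 1.0], [0.0, 1.0, 1.0])]

# document activation = sum_i w'_{i,q} * w'_{i,j}
doc_act = [sum(q * w for q, w in zip(wq, col)) for col in W]
```

Further iterations would feed `doc_act` back through the same (transposed) weights to the term layer and forward again; since all normalized weights are below 1, the activation wanes with each round.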
49 NN Model Discussion
1. NN: an interesting concept:
   (a) the concept of parallel, distributed computation
   (b) conceptually close to the VSM
   (c) its recursive nature allows the network to fine-tune the ranking
2. Evaluation:
   (a) few applications
   (b) mostly a theoretical contribution
   (c) refinements have been formulated
3. Future:
   (a) activation propagation in document networks
   (b) remove the keyterm layer
   (c) problem: creation of large document networks
More informationNeural Networks DWML, /25
DWML, 2007 /25 Neural networks: Biological and artificial Consider humans: Neuron switching time 0.00 second Number of neurons 0 0 Connections per neuron 0 4-0 5 Scene recognition time 0. sec 00 inference
More informationCOMP304 Introduction to Neural Networks based on slides by:
COMP34 Introduction to Neural Networks based on slides by: Christian Borgelt http://www.borgelt.net/ Christian Borgelt Introduction to Neural Networks Motivation: Why (Artificial) Neural Networks? (Neuro-)Biology
More informationUSING SAT FOR COMBINATIONAL IMPLEMENTATION CHECKING. Liudmila Cheremisinova, Dmitry Novikov
International Book Series "Information Science and Computing" 203 USING SAT FOR COMBINATIONAL IMPLEMENTATION CHECKING Liudmila Cheremisinova, Dmitry Novikov Abstract. The problem of checking whether a
More informationINTRODUCTION TO ARTIFICIAL INTELLIGENCE
v=1 v= 1 v= 1 v= 1 v= 1 v=1 optima 2) 3) 5) 6) 7) 8) 9) 12) 11) 13) INTRDUCTIN T ARTIFICIAL INTELLIGENCE DATA15001 EPISDE 8: NEURAL NETWRKS TDAY S MENU 1. NEURAL CMPUTATIN 2. FEEDFRWARD NETWRKS (PERCEPTRN)
More informationPart I: Web Structure Mining Chapter 1: Information Retrieval and Web Search
Part I: Web Structure Mining Chapter : Information Retrieval an Web Search The Web Challenges Crawling the Web Inexing an Keywor Search Evaluating Search Quality Similarity Search The Web Challenges Tim
More informationIntroduction to Information Retrieval
Introduction to Information Retrieval http://informationretrieval.org IIR 18: Latent Semantic Indexing Hinrich Schütze Center for Information and Language Processing, University of Munich 2013-07-10 1/43
More informationLecture 7 Artificial neural networks: Supervised learning
Lecture 7 Artificial neural networks: Supervised learning Introduction, or how the brain works The neuron as a simple computing element The perceptron Multilayer neural networks Accelerated learning in
More informationFuzzy Rules and Fuzzy Reasoning (chapter 3)
Fuzzy ules and Fuzzy easoning (chapter 3) Kai Goebel, Bill Cheetham GE Corporate esearch & Development goebel@cs.rpi.edu cheetham@cs.rpi.edu (adapted from slides by. Jang) Fuzzy easoning: The Big Picture
More informationPropositions. c D. Poole and A. Mackworth 2010 Artificial Intelligence, Lecture 5.1, Page 1
Propositions An interpretation is an assignment of values to all variables. A model is an interpretation that satisfies the constraints. Often we don t want to just find a model, but want to know what
More informationIntroduction to Search Engine Technology Introduction to Link Structure Analysis. Ronny Lempel Yahoo Labs, Haifa
Introduction to Search Engine Technology Introduction to Link Structure Analysis Ronny Lempel Yahoo Labs, Haifa Outline Anchor-text indexing Mathematical Background Motivation for link structure analysis
More informationThe Perceptron. Volker Tresp Summer 2016
The Perceptron Volker Tresp Summer 2016 1 Elements in Learning Tasks Collection, cleaning and preprocessing of training data Definition of a class of learning models. Often defined by the free model parameters
More informationNeural Networks and the Back-propagation Algorithm
Neural Networks and the Back-propagation Algorithm Francisco S. Melo In these notes, we provide a brief overview of the main concepts concerning neural networks and the back-propagation algorithm. We closely
More informationMachine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6
Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Neural Networks Week #6 Today Neural Networks A. Modeling B. Fitting C. Deep neural networks Today s material is (adapted)
More informationArtificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011!
Artificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011! 1 Todayʼs lecture" How the brain works (!)! Artificial neural networks! Perceptrons! Multilayer feed-forward networks! Error
More informationAn Evolution Strategy for the Induction of Fuzzy Finite-state Automata
Journal of Mathematics and Statistics 2 (2): 386-390, 2006 ISSN 1549-3644 Science Publications, 2006 An Evolution Strategy for the Induction of Fuzzy Finite-state Automata 1,2 Mozhiwen and 1 Wanmin 1 College
More informationCS276A Text Information Retrieval, Mining, and Exploitation. Lecture 4 15 Oct 2002
CS276A Text Information Retrieval, Mining, and Exploitation Lecture 4 15 Oct 2002 Recap of last time Index size Index construction techniques Dynamic indices Real world considerations 2 Back of the envelope
More informationLatent Dirichlet Allocation Introduction/Overview
Latent Dirichlet Allocation Introduction/Overview David Meyer 03.10.2016 David Meyer http://www.1-4-5.net/~dmm/ml/lda_intro.pdf 03.10.2016 Agenda What is Topic Modeling? Parametric vs. Non-Parametric Models
More informationImproved Algorithms for Module Extraction and Atomic Decomposition
Improved Algorithms for Module Extraction and Atomic Decomposition Dmitry Tsarkov tsarkov@cs.man.ac.uk School of Computer Science The University of Manchester Manchester, UK Abstract. In recent years modules
More informationInvestigation of Latent Semantic Analysis for Clustering of Czech News Articles
Investigation of Latent Semantic Analysis for Clustering of Czech News Articles Michal Rott, Petr Červa Laboratory of Computer Speech Processing 4. 9. 2014 Introduction Idea of article clustering Presumptions:
More informationCSE 494/598 Lecture-4: Correlation Analysis. **Content adapted from last year s slides
CSE 494/598 Lecture-4: Correlation Analysis LYDIA MANIKONDA HT TP://WWW.PUBLIC.ASU.EDU/~LMANIKON / **Content adapted from last year s slides Announcements Project-1 Due: February 12 th 2016 Analysis report:
More informationPart 8: Neural Networks
METU Informatics Institute Min720 Pattern Classification ith Bio-Medical Applications Part 8: Neural Netors - INTRODUCTION: BIOLOGICAL VS. ARTIFICIAL Biological Neural Netors A Neuron: - A nerve cell as
More informationNeural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17
3/9/7 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/9/7 Perceptron as a neural
More informationA Zadeh-Norm Fuzzy Description Logic for Handling Uncertainty: Reasoning Algorithms and the Reasoning System
1 / 31 A Zadeh-Norm Fuzzy Description Logic for Handling Uncertainty: Reasoning Algorithms and the Reasoning System Judy Zhao 1, Harold Boley 2, Weichang Du 1 1. Faculty of Computer Science, University
More informationCSE446: Neural Networks Spring Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer
CSE446: Neural Networks Spring 2017 Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer Human Neurons Switching time ~ 0.001 second Number of neurons 10 10 Connections per neuron 10 4-5 Scene
More informationFinancial Informatics XI: Fuzzy Rule-based Systems
Financial Informatics XI: Fuzzy Rule-based Systems Khurshid Ahmad, Professor of Computer Science, Department of Computer Science Trinity College, Dublin-2, IRELAND November 19 th, 28. https://www.cs.tcd.ie/khurshid.ahmad/teaching.html
More informationMachine Learning. Principal Components Analysis. Le Song. CSE6740/CS7641/ISYE6740, Fall 2012
Machine Learning CSE6740/CS7641/ISYE6740, Fall 2012 Principal Components Analysis Le Song Lecture 22, Nov 13, 2012 Based on slides from Eric Xing, CMU Reading: Chap 12.1, CB book 1 2 Factor or Component
More informationDRAFT CONCEPTUAL SOLUTION REPORT DRAFT
BASIC STRUCTURAL MODELING PROJECT Joseph J. Simpson Mary J. Simpson 08-12-2013 DRAFT CONCEPTUAL SOLUTION REPORT DRAFT Version 0.11 Page 1 of 18 Table of Contents Introduction Conceptual Solution Context
More informationGeneric Text Summarization
June 27, 2012 Outline Introduction 1 Introduction Notation and Terminology 2 3 4 5 6 Text Summarization Introduction Notation and Terminology Two Types of Text Summarization Query-Relevant Summarization:
More informationAssignment 3. Latent Semantic Indexing
Assignment 3 Gagan Bansal 2003CS10162 Group 2 Pawan Jain 2003CS10177 Group 1 Latent Semantic Indexing OVERVIEW LATENT SEMANTIC INDEXING (LSI) considers documents that have many words in common to be semantically
More informationUSING SINGULAR VALUE DECOMPOSITION (SVD) AS A SOLUTION FOR SEARCH RESULT CLUSTERING
POZNAN UNIVE RSIY OF E CHNOLOGY ACADE MIC JOURNALS No. 80 Electrical Engineering 2014 Hussam D. ABDULLA* Abdella S. ABDELRAHMAN* Vaclav SNASEL* USING SINGULAR VALUE DECOMPOSIION (SVD) AS A SOLUION FOR
More informationARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92
ARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92 BIOLOGICAL INSPIRATIONS Some numbers The human brain contains about 10 billion nerve cells (neurons) Each neuron is connected to the others through 10000
More informationThe Perceptron. Volker Tresp Summer 2014
The Perceptron Volker Tresp Summer 2014 1 Introduction One of the first serious learning machines Most important elements in learning tasks Collection and preprocessing of training data Definition of a
More informationData Mining Techniques
Data Mining Techniques CS 622 - Section 2 - Spring 27 Pre-final Review Jan-Willem van de Meent Feedback Feedback https://goo.gl/er7eo8 (also posted on Piazza) Also, please fill out your TRACE evaluations!
More informationArtificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino
Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as
More informationTecniche di Verifica. Introduction to Propositional Logic
Tecniche di Verifica Introduction to Propositional Logic 1 Logic A formal logic is defined by its syntax and semantics. Syntax An alphabet is a set of symbols. A finite sequence of these symbols is called
More information