Lesson4: Descriptive Modelling of Similarity of Text Unit3: Vector space models for similarity

1 Lesson4: Descriptive Modelling of Similarity of Text. Unit3: Vector space models for similarity. Rene Pickhardt, Introduction to Web Science, Part 2: Emerging Web Properties. CC-BY-SA Rene Pickhardt, Institute for Web Science and Technologies, University of Koblenz-Landau, Germany. Web Science Part2: 3 Ways to study the Web.

2 Completing this unit you should: be familiar with the vector space model for text documents; be aware of term frequency and (inverse) document frequency; have reviewed the definitions of basis and dimension; realize that the angle between two vectors can be seen as a similarity measure.

3-6 A model based on vector spaces (one diagram, built up step by step). The goal is to be able to make statements about the similarity of documents, in three steps: 1) Modelling: model documents as vectors of words (vector space model). 2) Calculating: calculate the distance between vectors. 3) Interpreting: interpret that distance as a statement about the similarity of the documents.

7 Pay close attention to notation. Words: $W = \{w_1, w_2, \ldots, w_n\}$. Word vectors: $V = \langle \vec{w}_1, \vec{w}_2, \ldots, \vec{w}_n \rangle$. A word $w_i$ corresponds to the word vector $\vec{w}_i$, and a document $D_j$ corresponds to the document vector $\vec{d}_j$. Documents $D_j$ are sequences of words from $W$, so $D_j = w_{i_1} w_{i_2} \ldots w_{i_m}$ has a length of $m$. Document vector: $\vec{d}_j = \sum_{i=1}^{n} tf(w_i, D_j)\, \vec{w}_i$.

8 Usually tf-idf is considered instead of tf! The document frequency is defined as $df(w_i) = |\{D_j \mid w_i \in D_j\}|$. The inverse document frequency is defined as $idf(w_i) = \log \frac{|D|}{df(w_i)}$, where $|D|$ is the number of documents, resulting in $tfidf(w_i, D_j) = tf(w_i, D_j) \cdot \log \frac{|D|}{df(w_i)}$. In the videos and slides, for simplicity of numbers, we will only use the term frequency.
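
The definitions on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the toy collection `docs` is hypothetical (it is not from the slides), the helper names `tf`, `df`, `idf` and `tfidf` simply mirror the symbols above, and the natural logarithm is an assumption, since the slides do not state the base of the logarithm.

```python
import math

def tf(word, doc):
    """Term frequency: how often `word` occurs in `doc` (a list of words)."""
    return doc.count(word)

def df(word, docs):
    """Document frequency: number of documents containing `word` at least once."""
    return sum(1 for doc in docs if word in doc)

def idf(word, docs):
    """Inverse document frequency log(|D| / df(word)); natural log is an assumption."""
    return math.log(len(docs) / df(word, docs))

def tfidf(word, doc, docs):
    """tf-idf weight of `word` in `doc`, relative to the collection `docs`."""
    return tf(word, doc) * idf(word, docs)

# Hypothetical toy collection, not taken from the slides.
docs = [["a", "a", "b"], ["a", "c"], ["a", "b", "b"]]

# One dictionary of tf-idf weights per document.
weights = [{w: tfidf(w, doc, docs) for w in sorted(set(doc))} for doc in docs]
for doc, w in zip(docs, weights):
    print(" ".join(doc), w)
```

In this toy collection the word a occurs in every document, so its idf is log(1) = 0 and its tf-idf weight vanishes everywhere. The sketch only evaluates words that occur in at least one document; for any other word `df` would be 0 and the division would fail.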

9 Example (generic language). Let us assume 3 documents: $D_1$ = a a a b b a a b a b, $D_2$ = a b b, $D_3$ = a a b. In our artificial language we have just two words, a and b. Which documents are similar to each other?

10 Choose a vector space and basis. Let $V = \langle \vec{a}, \vec{b} \rangle$ be the vector space spanned by the words a and b. $\vec{a} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is a basis vector for word a. Similarly, for word b we have $\vec{b} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. As before, a word $w_i$ corresponds to the word vector $\vec{w}_i$ and a document $D_j$ to the document vector $\vec{d}_j$. Choosing the basis was a modelling choice!

11 Calculate the modelled document vector $\vec{d}_1$. $D_1$ = a a a b b a a b a b, so $tf(a, D_1) = 6$ and $tf(b, D_1) = 4$. Let us now create the document vector: $\vec{d}_1 = \sum_{i=1}^{2} tf(w_i, D_1)\, \vec{w}_i = tf(a, D_1)\, \vec{a} + tf(b, D_1)\, \vec{b} = 6 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 4 \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 6 \\ 4 \end{pmatrix}$.

12 Calculate the modelled document vector $\vec{d}_2$. $D_2$ = a b b, so $tf(a, D_2) = 1$ and $tf(b, D_2) = 2$. Let us now create the document vector: $\vec{d}_2 = \sum_{i=1}^{2} tf(w_i, D_2)\, \vec{w}_i = tf(a, D_2)\, \vec{a} + tf(b, D_2)\, \vec{b} = 1 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$.

13 Calculate the modelled document vector $\vec{d}_3$. $D_3$ = a a b, so $tf(a, D_3) = 2$ and $tf(b, D_3) = 1$. Let us now create the document vector: $\vec{d}_3 = \sum_{i=1}^{2} tf(w_i, D_3)\, \vec{w}_i = tf(a, D_3)\, \vec{a} + tf(b, D_3)\, \vec{b} = 2 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 1 \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$.
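
The three calculations above follow the same recipe, so they can be sketched as one small Python function. The function name `document_vector` and the list-based representation are assumptions made for illustration; the vocabulary order (a, b) corresponds to the chosen basis vectors.

```python
def document_vector(doc, vocabulary):
    """Return the term-frequency vector of `doc` with one component per vocabulary word."""
    return [doc.count(word) for word in vocabulary]

# Basis order: first component counts the word a, second counts the word b.
vocabulary = ["a", "b"]

# The three example documents, split into words.
D1 = "a a a b b a a b a b".split()
D2 = "a b b".split()
D3 = "a a b".split()

d1 = document_vector(D1, vocabulary)
d2 = document_vector(D2, vocabulary)
d3 = document_vector(D3, vocabulary)
print(d1, d2, d3)  # [6, 4] [1, 2] [2, 1]
```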

14 Digression: Drawing the document vectors. $D_1$ = a a a b b a a b a b, $D_2$ = a b b, $D_3$ = a a b. $\vec{d}_1 = \begin{pmatrix} 6 \\ 4 \end{pmatrix}$, $\vec{d}_2 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$, $\vec{d}_3 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$.

15 Which vectors are closest to each other? $D_1$ = a a a b b a a b a b, $D_2$ = a b b, $D_3$ = a a b. $\vec{d}_1 = \begin{pmatrix} 6 \\ 4 \end{pmatrix}$, $\vec{d}_2 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$, $\vec{d}_3 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$.

16 Two ways of calculating the distance between two vectors $\vec{d}_i$ and $\vec{d}_j$. Euclidean distance: take a ruler and measure. That is, take the difference $\vec{d} = \vec{d}_i - \vec{d}_j$ and calculate the length of the difference $\|\vec{d}\|_2$. Cosine distance: take the angle between the vectors.

17-20 Calculate the Euclidean distance between two document vectors $\vec{d}_i$ and $\vec{d}_j$. Take the difference $\vec{d} = \vec{d}_i - \vec{d}_j$. Calculate the length of the difference: $\|\vec{d}\|_2 = \sqrt{\sum_{k=1}^{n} (d_k)^2} = \sqrt{\sum_{k=1}^{n} \left( (\vec{d}_i)_k - (\vec{d}_j)_k \right)^2} = \sqrt{\sum_{k=1}^{n} \left( tf(w_k, D_i) - tf(w_k, D_j) \right)^2}$. The $k$-th component of vector $\vec{d}_i$ is $(\vec{d}_i)_k = tf(w_k, D_i)$ by definition. For every word $w_k$ we compare how often it appears in document $D_i$ and document $D_j$.

21 Euclidean distances for our example. $D_1$ = a a a b b a a b a b, $D_2$ = a b b, $D_3$ = a a b. $\|\vec{d}_1 - \vec{d}_2\|_e \approx 5.4$, $\|\vec{d}_1 - \vec{d}_3\|_e = 5$, $\|\vec{d}_2 - \vec{d}_3\|_e \approx 1.4$.
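
The numbers on this slide can be reproduced with a short Python sketch. The function name `euclidean_distance` is an assumption for illustration; it takes the square root of the summed squared component differences, which is what yields the values 5.4, 5 and 1.4 for the example vectors (6, 4), (1, 2) and (2, 1).

```python
import math

def euclidean_distance(u, v):
    """Length of the difference vector u - v."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

# Term-frequency vectors of the example documents D1, D2, D3.
d1, d2, d3 = (6, 4), (1, 2), (2, 1)

print(round(euclidean_distance(d1, d2), 1))  # 5.4
print(round(euclidean_distance(d1, d3), 1))  # 5.0
print(round(euclidean_distance(d2, d3), 1))  # 1.4
```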

22 Calculate the cosine distance between two document vectors $\vec{d}_i$ and $\vec{d}_j$. Calculate the scalar product $s = \langle \vec{d}_i, \vec{d}_j \rangle$: $\langle \vec{d}_i, \vec{d}_j \rangle = \sum_{k=1}^{n} (\vec{d}_i)_k (\vec{d}_j)_k = \sum_{k=1}^{n} tf(w_k, D_i)\, tf(w_k, D_j)$. Remember: $(\vec{d}_i)_k = tf(w_k, D_i)$ is zero most of the time.

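23 Calculate the cosine distance between two document vectors $\vec{d}_i$ and $\vec{d}_j$. Calculate the scalar product $s = \langle \vec{d}_i, \vec{d}_j \rangle$. Divide it by the product of the lengths of both vectors (length as in the Euclidean distance): $\cos(\theta) = \frac{\langle \vec{d}_i, \vec{d}_j \rangle}{\|\vec{d}_i\| \, \|\vec{d}_j\|} \;\Rightarrow\; \theta = \cos^{-1}\left( \frac{\langle \vec{d}_i, \vec{d}_j \rangle}{\|\vec{d}_i\| \, \|\vec{d}_j\|} \right)$, where $\theta$ is the angle between the two vectors.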
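
As a worked instance of these two steps, take the example vectors $\vec{d}_2 = (1, 2)^T$ and $\vec{d}_3 = (2, 1)^T$. The symbol $\theta$ for the angle is a notational assumption made here, and the angle is given in radians.

```latex
\begin{align*}
\langle \vec{d}_2, \vec{d}_3 \rangle &= 1 \cdot 2 + 2 \cdot 1 = 4 \\
\|\vec{d}_2\| &= \sqrt{1^2 + 2^2} = \sqrt{5}, \qquad
\|\vec{d}_3\| = \sqrt{2^2 + 1^2} = \sqrt{5} \\
\cos(\theta) &= \frac{4}{\sqrt{5} \cdot \sqrt{5}} = 0.8 \\
\theta &= \cos^{-1}(0.8) \approx 0.64
\end{align*}
```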

24 Cosine distances for our example. $D_1$ = a a a b b a a b a b, $D_2$ = a b b, $D_3$ = a a b. $\|\vec{d}_1 - \vec{d}_2\|_c = 0.52$, $\|\vec{d}_1 - \vec{d}_3\|_c = 0.12$, $\|\vec{d}_2 - \vec{d}_3\|_c = 0.64$. These values are the angles between the vectors, in radians.
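
The values on this slide are consistent with taking the angle between the vectors in radians, which the following Python sketch reproduces. The function name `cosine_angle` is an assumption for illustration, and the clamping of the cosine to [-1, 1] is a defensive addition against floating-point rounding, not something stated in the slides.

```python
import math

def cosine_angle(u, v):
    """Angle (in radians) between the vectors u and v."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    cos = dot / (norm_u * norm_v)
    # Clamp to the valid domain of acos to absorb rounding errors.
    return math.acos(max(-1.0, min(1.0, cos)))

# Term-frequency vectors of the example documents D1, D2, D3.
d1, d2, d3 = (6, 4), (1, 2), (2, 1)

print(round(cosine_angle(d1, d2), 2))  # 0.52
print(round(cosine_angle(d1, d3), 2))  # 0.12
print(round(cosine_angle(d2, d3), 2))  # 0.64
```

Note that the sketch fails with a division by zero for an all-zero vector, that is, for an empty document.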

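25 Comparing cosine and Euclidean distance. $D_1$ = a a a b b a a b a b, $D_2$ = a b b, $D_3$ = a a b. Values as on slides 21 and 24:
$\vec{d}_1$, $\vec{d}_2$: Euclidean 5.4, cosine 0.52
$\vec{d}_2$, $\vec{d}_3$: Euclidean 1.4, cosine 0.64
$\vec{d}_1$, $\vec{d}_3$: Euclidean 5, cosine 0.12
Different choices of metric can yield very different results: the Euclidean distance ranks $D_2$ and $D_3$ as the closest pair, while the cosine distance ranks them as the pair farthest apart. The choice of metric is part of the model! Usually the cosine distance is considered.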
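
To make the effect of the metric concrete, the following Python sketch ranks the three document pairs from closest to farthest under both distances. The names `euclidean`, `angle`, `by_euclid` and `by_angle` are assumptions for illustration; the vectors are the term-frequency vectors of the example.

```python
import math
from itertools import combinations

# Term-frequency vectors of the example documents.
vectors = {"d1": (6, 4), "d2": (1, 2), "d3": (2, 1)}

def euclidean(u, v):
    """Euclidean distance between u and v."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def angle(u, v):
    """Angle (in radians) between the two-dimensional vectors u and v."""
    dot = sum(x * y for x, y in zip(u, v))
    cos = dot / (math.hypot(*u) * math.hypot(*v))
    return math.acos(max(-1.0, min(1.0, cos)))

# All unordered pairs of documents, ranked from closest to farthest.
pairs = list(combinations(vectors, 2))
by_euclid = sorted(pairs, key=lambda p: euclidean(vectors[p[0]], vectors[p[1]]))
by_angle = sorted(pairs, key=lambda p: angle(vectors[p[0]], vectors[p[1]]))

print("Euclidean:", by_euclid)
print("Cosine:   ", by_angle)
```

The two rankings disagree: the pair (d2, d3) comes first under the Euclidean distance and last under the angle, because the angle ignores vector length while the Euclidean distance does not.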

26 Thank you for your attention! Contact: Rene Pickhardt, Institute for Web Science and Technologies, Universität Koblenz-Landau.

27 Copyright: This slide deck is licensed under the Creative Commons Attribution-ShareAlike 3.0 license. It was created by Rene Pickhardt. You can use, share and modify this slide deck as long as you attribute the author and keep the same license. Image credits: By Alecmconroy (Own work) [GFDL or CC BY 3.0 (creativecommons.org/licenses/by/3.0)], via Wikimedia Commons; CC-BY-SA by CSTAR & Oleg Alexandrov.


More information

U.S. ARMY RESEARCH, DEVELOPMENT AND ENGINEERING COMMAND

U.S. ARMY RESEARCH, DEVELOPMENT AND ENGINEERING COMMAND U.S. ARMY RESEARCH, DEVELOPMENT AND ENGINEERING COMMAND Quantum Computing An Introduction to What it is, Why We Want it, and How We re Trying to Get it Dr. Sara Gamble Program Manager Quantum Information

More information

Term Weighting and Vector Space Model. Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze

Term Weighting and Vector Space Model. Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze Term Weighting and Vector Space Model Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze 1 Ranked retrieval Thus far, our queries have all been Boolean. Documents either

More information

Inverse Functions and Trigonometric Equations - Solution Key

Inverse Functions and Trigonometric Equations - Solution Key Inverse Functions and Trigonometric Equations - Solution Key CK Editor Say Thanks to the Authors Click http://www.ck.org/saythanks (No sign in required To access a customizable version of this book, as

More information

Inverse Functions. Say Thanks to the Authors Click (No sign in required)

Inverse Functions. Say Thanks to the Authors Click  (No sign in required) Inverse Functions Say Thanks to the Authors Click http://www.ck12.org/saythanks (No sign in required) To access a customizable version of this book, as well as other interactive content, visit www.ck12.org

More information

Worksheet 1.3: Introduction to the Dot and Cross Products

Worksheet 1.3: Introduction to the Dot and Cross Products Boise State Math 275 (Ultman Worksheet 1.3: Introduction to the Dot and Cross Products From the Toolbox (what you need from previous classes Trigonometry: Sine and cosine functions. Vectors: Know what

More information

The Law of Cosines. Say Thanks to the Authors Click (No sign in required)

The Law of Cosines. Say Thanks to the Authors Click  (No sign in required) The Law of Cosines Say Thanks to the Authors Click http://www.ck12.org/saythanks (No sign in required) To access a customizable version of this book, as well as other interactive content, visit www.ck12.org

More information

AP Physics C - Mechanics. Energy and Work. Slide 1 / 125 Slide 2 / 125. Slide 4 / 125. Slide 3 / 125. Slide 6 / 125. Slide 5 / 125.

AP Physics C - Mechanics. Energy and Work. Slide 1 / 125 Slide 2 / 125. Slide 4 / 125. Slide 3 / 125. Slide 6 / 125. Slide 5 / 125. Slide 1 / 125 Slide 2 / 125 AP Physics C - Mechanics Work and nergy 2015-12-03 www.njctl.org Slide 3 / 125 Slide 4 / 125 Table of Contents Click on the topic to go to that section nergy and Work Conservative

More information

ENR202 Mechanics of Materials Lecture 4A Notes and Slides

ENR202 Mechanics of Materials Lecture 4A Notes and Slides Slide 1 Copyright Notice Do not remove this notice. COMMMONWEALTH OF AUSTRALIA Copyright Regulations 1969 WARNING This material has been produced and communicated to you by or on behalf of the University

More information

vector space retrieval many slides courtesy James Amherst

vector space retrieval many slides courtesy James Amherst vector space retrieval many slides courtesy James Allan@umass Amherst 1 what is a retrieval model? Model is an idealization or abstraction of an actual process Mathematical models are used to study the

More information

Vectors FOR SAILSIM B Y E LLERY C HAN, PRECISION L IGHTWORKS 18 MARCH 2017

Vectors FOR SAILSIM B Y E LLERY C HAN, PRECISION L IGHTWORKS 18 MARCH 2017 Vectors FOR SAILSIM B Y E LLERY C HAN, PRECISION L IGHTWORKS 18 MARCH 2017 Despicable Me! helped me with my math homework -- comment on an online forum Vectors Useful for representing values that have

More information

MITOCW MITRES_18-007_Part2_lec1_300k.mp4

MITOCW MITRES_18-007_Part2_lec1_300k.mp4 MITOCW MITRES_18-007_Part2_lec1_300k.mp4 The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources

More information

This pre-publication material is for review purposes only. Any typographical or technical errors will be corrected prior to publication.

This pre-publication material is for review purposes only. Any typographical or technical errors will be corrected prior to publication. This pre-publication material is for review purposes only. Any typographical or technical errors will be corrected prior to publication. Copyright Pearson Canada Inc. All rights reserved. Copyright Pearson

More information

Reminder: Univariate Data. Bivariate Data. Example: Puppy Weights. You weigh the pups and get these results: 2.5, 3.5, 3.3, 3.1, 2.6, 3.6, 2.

Reminder: Univariate Data. Bivariate Data. Example: Puppy Weights. You weigh the pups and get these results: 2.5, 3.5, 3.3, 3.1, 2.6, 3.6, 2. TP: To review Standard Deviation, Residual Plots, and Correlation Coefficients HW: Do a journal entry on each of the calculator tricks in this lesson. Lesson slides will be posted with notes. Do Now: Write

More information

Vectors. Slide 2 / 36. Slide 1 / 36. Slide 3 / 36. Slide 4 / 36. Slide 5 / 36. Slide 6 / 36. Scalar versus Vector. Determining magnitude and direction

Vectors. Slide 2 / 36. Slide 1 / 36. Slide 3 / 36. Slide 4 / 36. Slide 5 / 36. Slide 6 / 36. Scalar versus Vector. Determining magnitude and direction Slide 1 / 3 Slide 2 / 3 Scalar versus Vector Vectors scalar has only a physical quantity such as mass, speed, and time. vector has both a magnitude and a direction associated with it, such as velocity

More information

1. Definition of a Polynomial

1. Definition of a Polynomial 1. Definition of a Polynomial What is a polynomial? A polynomial P(x) is an algebraic expression of the form Degree P(x) = a n x n + a n 1 x n 1 + a n 2 x n 2 + + a 3 x 3 + a 2 x 2 + a 1 x + a 0 Leading

More information

INFO 4300 / CS4300 Information Retrieval. slides adapted from Hinrich Schütze s, linked from

INFO 4300 / CS4300 Information Retrieval. slides adapted from Hinrich Schütze s, linked from INFO 4300 / CS4300 Information Retrieval slides adapted from Hinrich Schütze s, linked from http://informationretrieval.org/ IR 5: Scoring, Term Weighting, The Vector Space Model II Paul Ginsparg Cornell

More information

Chapter 11 Evolution by Permutation

Chapter 11 Evolution by Permutation Chapter 11 Evolution by Permutation In what follows a very brief account of reversible evolution and, in particular, reversible computation by permutation will be presented We shall follow Mermin s account

More information

Dealing with Text Databases

Dealing with Text Databases Dealing with Text Databases Unstructured data Boolean queries Sparse matrix representation Inverted index Counts vs. frequencies Term frequency tf x idf term weights Documents as vectors Cosine similarity

More information

UNIT V: Multi-Dimensional Kinematics and Dynamics Page 1

UNIT V: Multi-Dimensional Kinematics and Dynamics Page 1 UNIT V: Multi-Dimensional Kinematics and Dynamics Page 1 UNIT V: Multi-Dimensional Kinematics and Dynamics As we have already discussed, the study of the rules of nature (a.k.a. Physics) involves both

More information

Math 3191 Applied Linear Algebra

Math 3191 Applied Linear Algebra Math 191 Applied Linear Algebra Lecture 1: Inner Products, Length, Orthogonality Stephen Billups University of Colorado at Denver Math 191Applied Linear Algebra p.1/ Motivation Not all linear systems have

More information

Inside the Atom. Say Thanks to the Authors Click (No sign in required)

Inside the Atom. Say Thanks to the Authors Click   (No sign in required) Inside the Atom Say Thanks to the Authors Click http://www.ck12.org/saythanks (No sign in required) To access a customizable version of this book, as well as other interactive content, visit www.ck12.org

More information

University of Illinois at Urbana-Champaign. Midterm Examination

University of Illinois at Urbana-Champaign. Midterm Examination University of Illinois at Urbana-Champaign Midterm Examination CS410 Introduction to Text Information Systems Professor ChengXiang Zhai TA: Azadeh Shakery Time: 2:00 3:15pm, Mar. 14, 2007 Place: Room 1105,

More information

Electron Configuration and the Periodic Table C-SE-TE

Electron Configuration and the Periodic Table C-SE-TE Electron Configuration and the Periodic Table C-SE-TE Richard Parsons, (RichardP) Say Thanks to the Authors Click http://www.ck12.org/saythanks (No sign in required) To access a customizable version of

More information

MITOCW MITRES18_006F10_26_0501_300k-mp4

MITOCW MITRES18_006F10_26_0501_300k-mp4 MITOCW MITRES18_006F10_26_0501_300k-mp4 ANNOUNCER: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational

More information

Investigation of Latent Semantic Analysis for Clustering of Czech News Articles

Investigation of Latent Semantic Analysis for Clustering of Czech News Articles Investigation of Latent Semantic Analysis for Clustering of Czech News Articles Michal Rott, Petr Červa Laboratory of Computer Speech Processing 4. 9. 2014 Introduction Idea of article clustering Presumptions:

More information

1/11/2011. Chapter 4: Variability. Overview

1/11/2011. Chapter 4: Variability. Overview Chapter 4: Variability Overview In statistics, our goal is to measure the amount of variability for a particular set of scores, a distribution. In simple terms, if the scores in a distribution are all

More information

Inequalities. CK12 Editor. Say Thanks to the Authors Click (No sign in required)

Inequalities. CK12 Editor. Say Thanks to the Authors Click  (No sign in required) Inequalities CK12 Editor Say Thanks to the Authors Click http://www.ck12.org/saythanks (No sign in required) To access a customizable version of this book, as well as other interactive content, visit www.ck12.org

More information