Estimation of the Click Volume by Large Scale Regression Analysis


1 Estimation of the Click Volume by Large Scale Regression Analysis. Yury Lifshits, Dirk Nowotka [International Computer Science Symposium in Russia 07, Ekaterinburg]. Presented by: Aditya Menon, UCSD. May 15, 2008

2 Outline: 1 Background: Sponsored search. 2 Formal description: the history table. 3 Estimating click volume with linear regression. 4 Efficient sparse linear regression.

3 The setting: online advertising. Most web services rely on advertising as a source of income, so choosing the right ad to display is important: try to tailor it to the customer. An advertising engine is used to choose the ads: software that dynamically extracts ads based on user requests. Goal: maximize the number of user clicks.

4 How an advertising engine operates. The advertising engine keeps a database of ads and, given a request, returns a set of relevant ads.

5 Measuring usefulness: Clickthrough rate. A natural quantity of interest is the clickthrough rate, the fraction of times that an ad was clicked when it was displayed: Clickthrough rate(a) = (# times ad a is clicked) / (# times ad a is displayed). A high clickthrough rate signifies a well-chosen ad.

6 Measuring usefulness: Clickthrough rate. Theoretically, we can think of the clickthrough rate as the probability of a click given some information about an ad: Theoretical clickthrough rate(a) = Pr(Click | f(a)), where f(a) captures properties of the ad, the surrounding pages, the viewer, etc.

7 Measuring usefulness: Click volume. A related theoretical quantity is the click volume: the number of times that an ad would be clicked, assuming it is shown on all requests. We can think of a request as a potential opportunity for display; the click volume is just a sum of clickthrough rates over all requests r: Click volume(a) = Σ_r Clickthrough rate(a, r)
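
As a concrete illustration (my own sketch, not from the paper), this is how a click volume estimate would be assembled from per-request clickthrough-rate predictions; predict_ctr is a hypothetical placeholder for whatever model supplies Pr(Click | a, r):

def click_volume(ad, requests, predict_ctr):
    """Estimated click volume: sum of predicted clickthrough rates
    over all requests, assuming the ad is shown on every request."""
    return sum(predict_ctr(ad, r) for r in requests)

# Toy usage with a constant-rate predictor (purely illustrative):
requests = ["query: cheap flights", "query: hotel deals", "query: car hire"]
volume = click_volume("ad-123", requests, predict_ctr=lambda a, r: 0.02)
print(volume)  # 0.06 expected clicks over these three requests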

8 Measuring usefulness: Click volume. More theoretically, the click volume can be expressed as Click volume(a) = Σ_r Pr(Click | a, r). This can be thought of as an unweighted average over requests (scaled by the number of requests): every request counts equally.

9 How would knowing the click volume help us? Intuitively, the click volume helps us answer many relevant questions: the maintainer of the engine can estimate how many clicks an ad can expect to receive, and hence set the optimal price; the volume can help determine the target audience for an ad; we can use the volume to predict the purchase market for an ad; etc. In short: it is a useful thing to estimate! So how do we do this?

10 Estimating click volume. A natural question to ask is how we can estimate the click volume for an ad. Specifically, given an ad a, assuming it is displayed at all requests, how many of those displays will result in a click?

11 Aside: is the assumption justified? We assumed that an ad is displayed on all requests; is such an assumption justified? If "all" means a random sample of requests, then Click volume(a) = Σ_r Clickthrough rate(a, r) ∝ Pr(Click | a), so the estimate remains meaningful.

12 Where this paper fits in. This paper looks at how we can estimate the click volume for a new ad, given the past history of clicks on different ads. The key idea is to use linear regression on this history to predict the new click volume. The computational constraint is that the data here is large-scale and sparse. An algorithm is proposed that solves the regression problem efficiently, with guaranteed convergence. Critique #1: there is no experimental testing of the algorithm. Critique #2: there is no mention of other work on linear regression.

13 Outline: 1 Background: Sponsored search. 2 Formal description: the history table. 3 Estimating click volume with linear regression. 4 Efficient sparse linear regression.

14 Representing ads and requests. The two important components of the formal model are ads and requests. We represent an ad as a vector a ∈ A, each component indicating a particular feature (e.g. text, link, phone number, ...): a = (a_1, ..., a_m). We also represent a request as a vector r ∈ R, similarly a collection of features (e.g. IP address, referrer, query, ...): r = (r_1, ..., r_n). Note that these vectors may be sparse, e.g. under a bag-of-words model.
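
For instance (an illustrative sketch, not the paper's code), a bag-of-words ad or request vector can be stored sparsely as a dict mapping feature indices to values:

# Sparse feature vectors as {feature_index: value} dicts.
ad = {0: 1.0, 17: 2.0, 532: 1.0}        # e.g. counts of words in the ad text
request = {3: 1.0, 17: 1.0, 9041: 1.0}  # e.g. words of the query, IP-derived features

def sparse_dot(u, v):
    """Dot product of two sparse vectors; cost is linear in the smaller one."""
    if len(u) > len(v):
        u, v = v, u
    return sum(val * v.get(idx, 0.0) for idx, val in u.items())

print(sparse_dot(ad, request))  # 2.0: the single shared feature, index 17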

15 Introducing events. We define an event to be the result of the interaction between an ad and a request. We can think of an event e as a function of a and r, which we also represent by a vector: e(a, r) = (e_1, ..., e_p)

16 Creating a history table. We need some structure to help study the behaviour of ads and their requests. A history table can be thought of as a collection of events, together with whether or not each resulted in a click. If we let b_i ∈ {0, 1} denote a click, then HT = {(e(a_1, r_1), b_1), ..., (e(a_n, r_n), b_n)}
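
A minimal sketch of the history table as a data structure (illustrative only; the event map e(a, r) here is a hypothetical stand-in that simply concatenates ad and request features):

from typing import Dict, List, Tuple

SparseVec = Dict[int, float]

def event(ad: SparseVec, request: SparseVec, offset: int = 10_000) -> SparseVec:
    """Hypothetical event map e(a, r): ad features keep their indices,
    request features are shifted by a fixed offset."""
    e = dict(ad)
    for idx, val in request.items():
        e[offset + idx] = val
    return e

# History table: a list of (event vector, click indicator) pairs.
HistoryTable = List[Tuple[SparseVec, int]]
ht: HistoryTable = [
    (event({0: 1.0}, {3: 1.0}), 1),   # this display was clicked
    (event({0: 1.0}, {7: 1.0}), 0),   # this one was not
]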

17 Intuitive representation. As the name suggests, we can represent a history table in tabular form: let rows denote ads and columns denote ad requests. Each cell then holds the b_i's for all events formed by the pair (a, r).

18 Sparsity of table. An important property of the history table, when viewed in matrix form, is that it is sparse along the columns. The reason is that each column usually covers only a few cells: the ads for a given request are generated by the advertising engine. It is assumed that if the size of the table is m × n, there are O(n) nonzero entries. This is an important assumption used later to ease computation.

19 Problem: estimating the click volume. We can now specify our problem more formally. Problem: given an ad a, we wish to estimate Click volume(a), assuming that (i) the distribution of requests follows that of the history table, and (ii) the ad a would be shown on all the requests.

20 Outline: 1 Background: Sponsored search. 2 Formal description: the history table. 3 Estimating click volume with linear regression. 4 Efficient sparse linear regression.

21 Approaches that don't work: nearest neighbour. Nearest neighbour solutions: we could try to find nearby ads and use their click volumes to estimate the new volume. Problem #1: similarity of ads does not tell us anything about similarity of their volumes. Problem #2: they are too slow!

22 Approaches that don't work: collaborative filtering. Collaborative filtering: if we view the requests as users and ads as movies, this is similar to a prediction problem! Problem #1: we have absolutely no data about the new movie, or ad in this case (the cold-start problem). Problem #2: this ignores extra information, such as term similarity between ads and requests.

23 Suggested approach. The approach followed in the paper is three-fold: 1 Perform dimensionality reduction on the history table. 2 Aggregate the values in the reduced table. 3 Perform a least-squares regression. We look at the steps in turn.

24 Reduction of table. We can convert the table into a matrix by storing clickthrough rates for events, rather than storing separate click values for each event. We can also create generalized events ē = D(e) that reduce the dimensionality of events by some function D; e.g. D might merge events with similar words. Transform {(e_1, b_1), (e_2, b_2), ..., (e_k, b_k)} into (ē, (Σ_i b_i) / k), where ē = D(e_1) = D(e_2) = ... = D(e_k) represents the equivalence class of e_1, ..., e_k.
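
A sketch of this aggregation step, assuming a hypothetical dimensionality-reduction function D that maps each event to a hashable key identifying its equivalence class:

from collections import defaultdict

def aggregate(history, D):
    """Group events by their generalized event D(e) and replace each group's
    individual click indicators b_i by their average (a clickthrough proportion)."""
    clicks = defaultdict(float)
    counts = defaultdict(int)
    reps = {}                      # one representative event per class
    for e, b in history:
        key = D(e)
        clicks[key] += b
        counts[key] += 1
        reps.setdefault(key, e)
    return [(reps[k], clicks[k] / counts[k]) for k in reps]

# Toy D: use only the smallest feature index as the "class" of an event.
reduced = aggregate([({1: 1.0, 5: 1.0}, 1), ({1: 1.0, 9: 1.0}, 0)], D=lambda e: min(e))
print(reduced)  # one generalized event with clickthrough proportion 0.5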

25 Reduced history table. Dimensionality reduction and aggregation give us a reduced history table. We can think of it as a vector T of generalized events (equivalently, a matrix whose i-th row is ē_i) together with a vector β of clickthrough rates, one per generalized event: T = (ē_1, ē_2, ..., ē_n)ᵀ, β = (β_1, β_2, ..., β_n)ᵀ

26 Estimating clickthrough rate. Problem: given a new generalized event ē_m, we want to estimate its clickthrough rate β_m. But this is equivalent to asking: is there a function f so that f(T) ≈ β? T serves as our training set. In our case, we assume that there is a linear relation between T and β. This suggests a specific choice for f...

27 Estimating click volume. Aim: we'd like to find a vector α so that Tα ≈ β. This is a problem of linear regression. There are more equations than variables, hence the problem is overconstrained: no exact solution exists in general, so we want the one with minimal discrepancy.

28 Formulating the problem as logistic regression. We need to address a technical point: regression works over (−∞, ∞), whereas we want our β to lie in [0, 1]. Solution: we perform a logit transform to get a new vector γ = logit(β), which takes values in (−∞, ∞). Recall: the logit function is logit : x ↦ log(x / (1 − x)). The equivalent problem is min_α ‖Tα − γ‖. Note: this is really logistic regression!
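
The transform in code (a small sketch; in practice one would clamp rates of exactly 0 or 1 before applying logit, which the formula leaves implicit):

import math

def logit(x, eps=1e-6):
    """Map a rate in (0, 1) to the real line; clamp to avoid infinities at 0 and 1."""
    x = min(max(x, eps), 1.0 - eps)
    return math.log(x / (1.0 - x))

def inv_logit(z):
    """Inverse of logit: map a real score back to a rate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(logit(0.02))             # about -3.89
print(inv_logit(logit(0.02)))  # back to about 0.02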

29 Solution summary. 1 Find the equivalence classes of events, giving a matrix T. 2 Aggregate click values for equivalent events, giving a vector β of clickthrough proportions. 3 Find the vector α which minimizes ‖Tα − γ‖, where γ = logit(β). 4 The predicted click volume is then simply CV(a) = Σ_i logit⁻¹(α · ē(a, r_i))
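
Putting step 4 into code (a sketch under my own assumptions: alpha is a weight vector indexed by the same features as the generalized events, and generalized_event is a hypothetical helper applying D to e(a, r)):

def predicted_click_volume(ad, requests, alpha, generalized_event, inv_logit):
    """CV(a) = sum over sampled requests r_i of logit^{-1}(alpha . e'(a, r_i))."""
    total = 0.0
    for r in requests:
        e = generalized_event(ad, r)                        # sparse dict {index: value}
        score = sum(alpha[idx] * val for idx, val in e.items())
        total += inv_logit(score)
    return total

# Toy usage: two requests, alpha keyed by feature index, a made-up event map.
from math import exp
alpha = {0: -3.0, 1: 0.5}
ge = lambda a, r: {0: 1.0, 1: float(a == r)}   # hypothetical generalized event map
print(predicted_click_volume("x", ["x", "y"], alpha, ge, lambda z: 1 / (1 + exp(-z))))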

31 Are we done? Intuitively, this method will work. Question: but is it efficient? Answer: no! Standard regression techniques involve inverting a large, non-sparse matrix. We need a new technique to efficiently solve this problem.

32 Note on sampling requests. Notice that we need to sum over all requests to estimate the click volume, which is potentially an intractable operation. Solution: consider only a sample of requests, e.g. over a fixed time window.

33 Outline: 1 Background: Sponsored search. 2 Formal description: the history table. 3 Estimating click volume with linear regression. 4 Efficient sparse linear regression.

34 A geometric interpretation of Tα. If we write out the matrix product Tα, it is clear that Tα = (T_11 α_1 + ... + T_1m α_m, ..., T_n1 α_1 + ... + T_nm α_m) = α_1 t_1 + ... + α_m t_m. So Tα is nothing but a linear combination of the columns of T: Tα ∈ span(t_1, ..., t_m).
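
A quick numerical check of this identity (a standalone NumPy illustration, not code from the paper):

import numpy as np

T = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0],
              [4.0, 0.0, 0.0],
              [0.0, 1.0, 5.0]])
alpha = np.array([0.5, -1.0, 2.0])

# T @ alpha equals the same linear combination of T's columns.
combo = sum(alpha[j] * T[:, j] for j in range(T.shape[1]))
print(np.allclose(T @ alpha, combo))  # True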

35 Geometric framework for our problem. When we want ‖Tα − γ‖ to be minimal, we really want to find the optimal projection of γ onto the span of T's columns. [Figure: γ and its projection γ* = Tα onto span(t_1, ..., t_m).]

36 Sequential solution. We will look at a sequence of vectors {α^(k)}, k ∈ ℕ, starting from the guess α^(0) = 0. Idea: solve this problem by focussing on one column at a time.

37 Geometrically inspired update. Update rule: choose any j and fix all columns but j; now seek the best projection of γ onto the j-th column of T. To do this, we project the current error in our solution onto the column vector t_j at each step. This is the same as projecting γ, by linearity.

38 Visualizing the update. Find the current error in our solution.

39 Visualizing the update. Project the error onto the vector t_j.

40 Visualizing the update. Now repeat this operation.

41 Recap: vector projection. Recall that the optimal projection of a vector b onto a is proj_a b = (a·b / ‖a‖²) a. [Figure: b, a, and the angle θ between them; the length of the projection is a·b / ‖a‖.]
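
In code (a tiny NumPy illustration of the same formula):

import numpy as np

def project(b, a):
    """Orthogonal projection of b onto the line spanned by a: (a.b / ||a||^2) a."""
    return (a @ b) / (a @ a) * a

a = np.array([3.0, 0.0])
b = np.array([2.0, 5.0])
print(project(b, a))             # [2. 0.]
print((b - project(b, a)) @ a)   # 0.0: the residual is orthogonal to a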

42 Measuring how close we are to optimal. Suppose that the discrepancy at step k is denoted by ϕ^(k) := γ − Tα^(k), and that the columns we choose are J = {j_1, j_2, ...}. We can derive the discrepancy at step k + 1: ϕ^(k) − ϕ^(k+1) = (ϕ^(k)·t_{j_k} / ‖t_{j_k}‖²) t_{j_k}. That is, each step removes the projection of the previous discrepancy onto the currently chosen column.

43 Update rule. There is a dual view in terms of the changes to our α vectors. Our update for α works only on the chosen component j_k: α^(k+1)_{j_k} = α^(k)_{j_k} + (ϕ^(k)·t_{j_k}) / ‖t_{j_k}‖², with all other components unchanged.

44 The algorithm.
while the change in ‖ϕ‖ > ɛ:
    choose j ∈ {1, ..., m} arbitrarily
    α_j ← α_j + (ϕ·t_j) / ‖t_j‖²
    ϕ ← ϕ − ((ϕ·t_j) / ‖t_j‖²) t_j
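
A runnable sketch of this coordinate-wise scheme (my own illustration of the update rule, using dense NumPy columns for clarity; the cyclic choice of j and the stopping test are assumptions, since the slides leave them open):

import numpy as np

def sparse_regression(T, gamma, eps=1e-10, max_iter=100_000):
    """Minimize ||T @ alpha - gamma|| by repeatedly projecting the residual
    onto one column of T at a time (columns chosen cyclically here)."""
    n, m = T.shape
    alpha = np.zeros(m)
    phi = gamma.copy()                 # residual phi = gamma - T @ alpha
    norms = (T * T).sum(axis=0)        # precomputed ||t_j||^2
    for k in range(max_iter):
        j = k % m                      # any schedule visiting every column infinitely often works
        step = (phi @ T[:, j]) / norms[j]
        alpha[j] += step
        new_phi = phi - step * T[:, j]
        if abs(np.linalg.norm(phi) - np.linalg.norm(new_phi)) < eps:
            return alpha, new_phi
        phi = new_phi
    return alpha, phi

T = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
gamma = np.array([1.0, 2.0, 2.0])
alpha, phi = sparse_regression(T, gamma)
print(alpha)                 # close to the least-squares solution [0.667, 0.5]
print(np.linalg.norm(phi))   # close to the minimal discrepancy, about 0.408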

45 Convergence of the updates. Our updates are intuitive, but how well do they work? We have the following guarantee. [Convergence theorem] Suppose we choose a sequence of columns J = {j_1, j_2, ...} in which every column appears infinitely many times. Then the sequence ϕ^(k), as defined above, converges to ϕ*, the minimal discrepancy: the orthogonal projection error of γ onto the span of T's columns.

46 Convergence of the updates. Intuitively, our updates should converge; consider e.g. the case where we have just two columns.

47 Proof outline. The proof is in three steps: 1 Show that ‖ϕ^(k)‖ is bounded and monotonically decreasing in k. 2 Assume that ϕ^(k) does not converge to ϕ*, but instead converges to some ψ. 3 Show that ϕ^(k) cannot in fact converge to ψ.

48 Bounded nature of ϕ^(k). First, note that the vectors ϕ^(k+1) and t_{j_k} are always orthogonal: ϕ^(k+1)·t_{j_k} = ϕ^(k)·t_{j_k} − (ϕ^(k)·t_{j_k} / ‖t_{j_k}‖²)(t_{j_k}·t_{j_k}) = 0. Then Pythagoras' theorem tells us that 0 ≤ ‖ϕ^(k+1)‖ ≤ ‖ϕ^(k)‖ ≤ ... ≤ ‖ϕ^(0)‖. [Figure: right triangle with hypotenuse ϕ^(k) and legs λ t_{j_k} and ϕ^(k+1).]

49 Towards a contradiction. Since we know that ‖ϕ^(k)‖ is bounded and monotone, it must converge [Monotone Convergence theorem]. Say that ϕ^(k) converges to some ψ ≠ ϕ*.

50 Towards a contradiction. We now show that ϕ^(k) cannot in fact converge to ψ. At some stage, ϕ^(k) must come within distance c/2 of ψ, for any constant c > 0. Now look at the potential updates: 1 t_{j_k} ⊥ ψ: this means we only get closer to ψ. 2 t_{j_k} not orthogonal to ψ: this means the size of ϕ^(k) reduces dramatically.

51 Case 1: t_{j_k} ⊥ ψ. If t_{j_k} ⊥ ψ, then t_{j_k}·ψ = 0. We have ϕ^(k) − ψ = (ϕ^(k+1) − ψ) + (ϕ^(k)·t_{j_k} / ‖t_{j_k}‖²) t_{j_k}. But the two terms on the RHS are orthogonal, which means that ‖ϕ^(k) − ψ‖ ≥ ‖ϕ^(k+1) − ψ‖. This means we cannot escape ψ, as we only get closer to it.

52 Case 2: t_{j_k} not orthogonal to ψ. If t_{j_k} is not orthogonal to ψ, we can lower bound the length of the shift: ‖(ϕ^(k)·t_{j_k} / ‖t_{j_k}‖²) t_{j_k}‖ = |ψ·t_{j_k} + (ϕ^(k) − ψ)·t_{j_k}| / ‖t_{j_k}‖ ≥ (c ‖t_{j_k}‖ − (c/2) ‖t_{j_k}‖) / ‖t_{j_k}‖ = c/2. But ϕ^(k) − ϕ^(k+1) is exactly this shift, and it is orthogonal to ϕ^(k+1), so ‖ϕ^(k+1)‖² ≤ ‖ϕ^(k)‖² − c²/4

53 The contradiction. But these results lead to a contradiction. We have said that: 1 ϕ^(k) visits the c/2-neighbourhood of ψ infinitely often. 2 Once it is inside the neighbourhood, updates with t_{j_k} ⊥ ψ cannot let it escape. 3 Once it is inside the neighbourhood, updates with t_{j_k} not orthogonal to ψ cause a substantial decrease in the size of ϕ^(k). But ‖ϕ^(k)‖ is nonnegative and monotone, so it cannot decrease substantially infinitely often. Contradiction!

54 Where are we now? We have shown that the given approach does converge to the optimal solution. This is good, but there is another natural question to ask...

55 Runtime analysis. How efficient is this approach? We show that we can exploit the sparsity of the history table. [Runtime bound] Each update step on column j can be performed in time linear in the number of nonzero elements of that column of the history table.

56 How to do the update. Suppose there are q_j non-zero elements in t_j. We do the update as follows: 1 Precompute all norms ‖t_j‖². 2 Update α^(k)_j in O(q_j) time: we just need to compute the dot product of t_j and ϕ^(k). 3 Update ϕ^(k) in O(q_j) time: we just need to subtract a scalar multiple of t_j, which touches only its q_j nonzero entries.
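
A sketch of one sparse update step under these assumptions (columns stored as dicts of their nonzero entries, my own choice of representation; ϕ kept as a dense list so that reading and writing q_j positions costs O(q_j)):

def sparse_column_update(alpha, phi, col, j, col_sq_norm):
    """One update on column j, touching only the q_j nonzero entries of that column.
    col: {row_index: value} for the nonzeros of t_j; col_sq_norm: precomputed ||t_j||^2."""
    step = sum(phi[i] * v for i, v in col.items()) / col_sq_norm   # phi . t_j / ||t_j||^2
    alpha[j] += step
    for i, v in col.items():                                        # phi <- phi - step * t_j
        phi[i] -= step * v
    return step

# Toy usage: 3 rows, column j = 0 has nonzeros in rows 0 and 2.
alpha, phi = [0.0, 0.0], [1.0, 2.0, 2.0]
sparse_column_update(alpha, phi, col={0: 1.0, 2: 1.0}, j=0, col_sq_norm=2.0)
print(alpha, phi)   # alpha[0] becomes 1.5; phi at rows 0 and 2 decreases by 1.5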

57 Note on convergence. Important caveat: we don't have any guarantee on the convergence rate; all we know is that the iteration does converge. The authors state this as an open problem; they are more concerned with introducing a new idea into the literature.

58 Outline: 1 Background: Sponsored search. 2 Formal description: the history table. 3 Estimating click volume with linear regression. 4 Efficient sparse linear regression.

59 Conclusion. The click volume is a quantity of interest in web advertising. Given a past history of clicks, we can estimate the click volume of a new ad using linear regression. When the data is sparse, we need a new technique to compute the regression efficiently. The paper provides an algorithm for efficiently solving this problem, with a proof of correctness (but no convergence-rate bound).
