Where and how can linear algebra be useful in practice.

Similar documents
Graph Models The PageRank Algorithm

Link Analysis. Leonid E. Zhukov

A Note on Google s PageRank

Google Page Rank Project Linear Algebra Summer 2012

Lab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018

Inf 2B: Ranking Queries on the WWW

Affine iterations on nonnegative vectors

How does Google rank webpages?

Numerical Methods I: Eigenvalues and eigenvectors

Data Mining Recitation Notes Week 3

Applications to network analysis: Eigenvector centrality indices Lecture notes

A New Method to Find the Eigenvalues of Convex. Matrices with Application in Web Page Rating

Page rank computation HPC course project a.y

1 The linear algebra of linear programs (March 15 and 22, 2015)

Math 304 Handout: Linear algebra, graphs, and networks.

0.1 Naive formulation of PageRank

Link Analysis Information Retrieval and Data Mining. Prof. Matteo Matteucci

Multimedia Networking ECE 599

Lesson 10: Comparing Functions and their features

Uncertainty and Randomization

6.003: Signals and Systems. Sampling and Quantization

CS6220: DATA MINING TECHNIQUES

Google PageRank. Francesco Ricci Faculty of Computer Science Free University of Bozen-Bolzano

Link Analysis. Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze

1 Searching the World Wide Web

PageRank. Ryan Tibshirani /36-662: Data Mining. January Optional reading: ESL 14.10

Objective: Reduction of data redundancy. Coding redundancy Interpixel redundancy Psychovisual redundancy Fall LIST 2

IR: Information Retrieval

Justification and Application of Eigenvector Centrality

How works. or How linear algebra powers the search engine. M. Ram Murty, FRSC Queen s Research Chair Queen s University

Singular Value Decompsition

Lecture 7 Mathematics behind Internet Search

ECEN 689 Special Topics in Data Science for Communications Networks

Data and Algorithms of the Web

eigenvalues, markov matrices, and the power method

Calculating Web Page Authority Using the PageRank Algorithm

Quantized Principal Component Analysis with Applications to Low-Bandwidth Image Compression and Communication

1998: enter Link Analysis

Characteristics of Linear Functions (pp. 1 of 8)

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Today. Next lecture. (Ch 14) Markov chains and hidden Markov models

CS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine

Linear Algebra for Machine Learning. Sargur N. Srihari

googling it: how google ranks search results Courtney R. Gibbons October 17, 2017

CS6220: DATA MINING TECHNIQUES

A hybrid reordered Arnoldi method to accelerate PageRank computations

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices)

L(3, 2, 1)-LABELING FOR CYLINDRICAL GRID: THE CARTESIAN PRODUCT OF A PATH AND A CYCLE. Byeong Moon Kim, Woonjae Hwang, and Byung Chul Song

SYDE 575: Introduction to Image Processing. Image Compression Part 2: Variable-rate compression

CS249: ADVANCED DATA MINING

CS246: Mining Massive Datasets Jure Leskovec, Stanford University.

6. Iterative Methods for Linear Systems. The stepwise approach to the solution...

Algebraic Representation of Networks

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Introduction to Search Engine Technology Introduction to Link Structure Analysis. Ronny Lempel Yahoo Labs, Haifa

Link Mining PageRank. From Stanford C246

Link Analysis Ranking

The Giving Game: Google Page Rank

Applications of The Perron-Frobenius Theorem

Announcements: Warm-up Exercise:

Image Data Compression

FALL 2011, LECTURE 1 (9/8/11) This is subject to revision. Current version: Thu, Sep 8, 1:00 PM

Application. Stochastic Matrices and PageRank

The Hypercube Graph and the Inhibitory Hypercube Network

Compressing a 1D Discrete Signal

Module 4. Multi-Resolution Analysis. Version 2 ECE IIT, Kharagpur

Christopher Rosenthal graduated in 2005 with a B.S. in computer

Krylov Subspace Methods to Calculate PageRank

Data Mining and Matrices

Image Compression. 1. Introduction. Greg Ames Dec 07, 2002

Computing PageRank using Power Extrapolation

The Google Markov Chain: convergence speed and eigenvalues

No class on Thursday, October 1. No office hours on Tuesday, September 29 and Thursday, October 1.

CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68

Node and Link Analysis

Math Bootcamp An p-dimensional vector is p numbers put together. Written as. x 1 x =. x p

Math 307 Learning Goals. March 23, 2010

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE

Compressing a 1D Discrete Signal

A Parallel Algorithm for Computing the Extremal Eigenvalues of Very Large Sparse Matrices*

Proyecto final de carrera

Robust Network Codes for Unicast Connections: A Case Study

ECS 289 / MAE 298, Network Theory Spring 2014 Problem Set # 1, Solutions. Problem 1: Power Law Degree Distributions (Continuum approach)

THEORY OF SEARCH ENGINES. CONTENTS 1. INTRODUCTION 1 2. RANKING OF PAGES 2 3. TWO EXAMPLES 4 4. CONCLUSION 5 References 5

LEARNING DYNAMIC SYSTEMS: MARKOV MODELS

Pr[positive test virus] Pr[virus] Pr[positive test] = Pr[positive test] = Pr[positive test]

On the complexity of maximizing the minimum Shannon capacity in wireless networks by joint channel assignment and power allocation

Performance Analysis for Strong Interference Remove of Fast Moving Target in Linear Array Antenna

Function? c. {(-1,4);(0,-4);(1,-3);(-1,5);(2,-5)} {(-2,3);(-1,3);(0,1);(1,-3);(2,-5)} a. Domain Range Domain Range

MATH36001 Perron Frobenius Theory 2015

CSE 494/598 Lecture-4: Correlation Analysis. **Content adapted from last year s slides

PLC Papers Created For:

RLE = [ ; ], with compression ratio (CR) = 4/8. RLE actually increases the size of the compressed image.

ECE 661: Homework 10 Fall 2014

Link Analysis and Web Search

Chapter 2 Formulas and Definitions:

encoding without prediction) (Server) Quantization: Initial Data 0, 1, 2, Quantized Data 0, 1, 2, 3, 4, 8, 16, 32, 64, 128, 256

Complete Sets of Hamiltonian Circuits for Classification of Documents

ELEC E7210: Communication Theory. Lecture 10: MIMO systems

Joint Optimization of Segmentation and Appearance Models

Transcription:

Where and how can linear algebra be useful in practice. Jiří Fiala All examples are simplified to demonstrate the use of tools of linear algebra. Real implementations are much more complex.

Systems of linear equations Weather forecast For weather forecasting the temperature, pressure, humidity and the speed and direction of airflow should be determined with sufficient accuracy based on the current weather reports from weather stations. Such situation could be described by a system of differential equations. Their exact solution is difficult and in some cases impossible.

Systems of linear equations For solving large systems of second order differential equations engineers use the finite element method. It it based on a division of the investigated space into small cells. On each cell the desired function is approximated by a linear combination of simple functions, e.g. piecewise linear. An approximation of a smooth function by a piecewise linear function. A decomposition of a piecewise linear function to a linear combination of simple functions. An example of a piecewise linear function on a triangulation in R 2

Systems of linear equations For modeling the airflow above the Central Europe, the space is divided by a three-dimensional grid of 530 420 87 points (with horizontal pitch about 4,7 km). The total number of grid points and hence the number of the corresponding cells, is almost 20 000 000. The resulting given system of differential equations is converted to a system of several hundreds millions of linear equations. Meteorologists at the ČHMÚ in Komořany need and can solve such a system. The weather forecast is updated every 6 hours according to the calculated model.

Basis change in a vector space Image compression in jpeg Jpeg (Joint photographic expert group) is a raster graphic format. Its principle is as follows: The image is first converted into the YCbCr color space. (YCbCr model was originally designed to transmit TV signals.) Each layer is cut into parts of size 8 8 pixels and each part is processed separately.

Basis change in a vector space Instead of encoding the intensity of 64 individual points the image of the entire piece is composed as a linear combination of 64 discrete harmonic functions above the 8 8 grid, which is the so-called discrete cosine Fourier transform Basis of individual points Basis of functions cos(ix) cos(jy) Small coefficients can be neglected, which leads the lossy compression. It has much better compression ratio than any lossless encoding, while the changes in the image could hardly be recognized by humans.

Eigenvalues Searching on Google Search for webpages by keywords is usually performed in two stages: 1. First, find all pages that contain those keywords. 2. Then sort this group by relevance. The calculation of the importance of pages in Google s search engine called the PageRank. was designed according to the Kendall Wei theory (±1950). Here the relevance of a page is directly proportional to the sum of relevances of pages that refer to it.

Eigenvalues Imagine that links between all web pages are encoded in a matrix M, where rows and columns correspond to single pages, s.t. { 1 if the i-th page refers on the j-th m i,j = 0 otherwise We get the so called adjacency matrix of the web. Relevances correspond to a nonnegative vector x, satisfying Mx = cx for a suitable positive constant c. Observe that x is the eigenvector of M corresponding to the eigenvalue c. It can be shown that with a minor modification of the matrix M, the eigenvalue c of the largest absolute value will be real and positive. Additionally, the eigenvector x corresponding to such eigenvalue has all its components positive.

Eigenvalues Some facts about the PageRank method. A draft of the method was designed by Larry Page and Sergey Brin. The project began in 1995, the first implementation has been tested in 1998. The vector x is updated continually, so that each component is recalculated once a month. The number of pages, i.e. the order the hypothetical matrix M in 2016 was of the magnitude of 4,75 billion. The matrix M is of course very sparse it has 7 links per page in average. In practice, the adjacency graph of the web is decomposed into smaller blocks, and even those are not represented by a matrix. The vector x is calculated by iterative approximate methods (25 80 iterations) with a fixed value of c = 0.85.

Integer linear programming Interference minimization in GSM networks A part of the wireless communication in GSM networks takes place between mobile phones and the co called BTS stations. A station is usually equipped with one or three antennas that serve the neighborhood of the station, called a cell. For the communication within the cell one or more frequencies are used, depending on the number of active GSM phones in the cell. On a single frequency, 6 8 phones can be served. Frequencies shall be allocated to transmitters so that no or only minimal interference appears. First, frequencies allocated to the transmitters of the same BTS should be treated. One shall also consider interference between the same or similar frequencies of close BTS s. The level of interference may depend on the surrounding terrain and other factors.

Integer linear programming A suitable plan for the frequency allocation can be obtained by integer linear programming methods. Binary variables x v f encode whether a transmitter v will be assigned frequency f. Further binary variables x v,v f,f which control the interference between two conflicting frequencies f and f at close transmitters v and v. The conditions of the ILP instance Ax = b guarantee that: each transmitter has available an appropriate number of frequencies variables x v,v f,f correctly determine if an interference appears. The objective function c T x is designed to minimize the number of positively evaluated variables x v,v f,f, i.e. to minimize the total interference.

Integer linear programming Realistic instances published on FAP web are of scale 40 80 frequences for 250 400 transmitters, i.e. 13 000 300 000 variables x v f 20 000 1 000 000 conditions, i.e. variables x v,v f,f Frequency plans were compiled for some scenarios where the interference dropped by 80 95 % w.r.t. to the plans used. Use one of these frequency plans led to the reduction of the overhead for the total handover of calls between cells by 20 %.

Further reading Finite element method, Jpeg, discrete cosine Fourier transform, Pagerank Wikipedia, Mathworld Google search engine structure Brin, Page: The anatomy of a large-scale hypertextual web search engine, Proc. WWW 7 (1998) 107 117 Frequency allocation in GSM networks Eisenblätter: Assigning frequencies in GSM networks Oper. Research Proc. 2002, Springer (2003) 33 40 Eisenblätter et al.: Frequency planning and ramifications of coloring, Disc. Math., Graph Th. 22(2002) 51 88