Internet Measurement and Data Analysis (7)


Internet Measurement and Data Analysis (7)
Kenjiro Cho
2012-11-14

review of previous class

Class 6 Correlation (11/7)
- Online recommendation systems
- Distance
- Correlation coefficient
- exercise: correlation analysis

2 / 30

today's topics

Class 7 Multivariate analysis
- Data sensing
- Linear regression
- Principal Component Analysis
- exercise: linear regression

3 / 30

multivariate analysis

- univariate analysis explores a single variable in a data set, separately
- multivariate analysis looks at more than one variable at a time
  - enabled by computers
  - finding hidden trends (data mining)

4 / 30

data sensing

- data sensing: collecting data from remote sites
- it has become possible to access various sensor information over the Internet
  - weather information, power consumption, etc.

5 / 30

example: Internet vehicle experiment

- by WIDE Project in Nagoya in 2001
- location, speed, and wiper usage data from 1,570 taxis
- blue areas indicate a high ratio of wiper usage, showing rainfall in detail

6 / 30

Japan Earthquake

- the system is now part of ITS
- usable-roads info was released 3 days after the quake
- data provided by HONDA (TOYOTA, NISSAN)
- source: google crisis response

7 / 30

example: data center as data

8 / 30

measurement metrics of the Internet

measurement metrics
- link capacity, throughput
- delay
- jitter
- packet loss rate

methodologies
- active measurement: injects measurement packets (e.g., ping)
- passive measurement: monitors the network without interfering in traffic
  - monitor at 2 locations and compare
  - infer from observations (e.g., behavior of TCP)
  - collect measurements inside a transport mechanism

9 / 30

delay measurement

delay components
- delay = propagation delay + queueing delay + other overhead
- if not congested, delay is close to the propagation delay

methods
- round-trip delay
- one-way delay: requires clock synchronization
- average delay
- max delay: e.g., voice communication requires < 400ms
- jitter: variations in delay

10 / 30

some delay numbers

packet transmission time (so-called wire speed)
- 1500 bytes at 10Mbps: 1.2msec
- 1500 bytes at 100Mbps: 120usec
- 1500 bytes at 1Gbps: 12usec
- 1500 bytes at 10Gbps: 1.2usec

speed of light in fiber: about 200,000 km/s
- 100km round-trip: 1 msec
- 20,000km round-trip: 200msec

satellite round-trip delay
- LEO (Low-Earth Orbit): 200 msec
- GEO (Geostationary Orbit): 600msec

11 / 30
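As a sanity check on these figures, here is a small Ruby sketch, added to these notes for illustration (not part of the original slides), that recomputes the transmission times and the fiber propagation delays:

#!/usr/bin/env ruby
# recompute the delay numbers above:
#   transmission time = packet size / link rate
#   propagation delay = distance / speed of light in fiber
PKT_BITS = 1500 * 8                    # 1500-byte packet

[10e6, 100e6, 1e9, 10e9].each do |bps|
  printf "1500 bytes at %6.0f Mbps: %8.1f usec\n",
         bps / 1e6, PKT_BITS / bps * 1e6
end

C_FIBER = 200_000.0                    # km/s, light in fiber

[100, 20_000].each do |km|
  # round-trip: the signal travels the link distance twice
  printf "%6d km round-trip: %5.1f msec\n", km, 2 * km / C_FIBER * 1e3
end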

packet loss measurement

packet loss rate
- the loss rate alone is enough if packet loss is random...
- in reality:
  - bursty loss: e.g., buffer overflow
  - packet size dependency: e.g., bit error rate in wireless transmission

12 / 30

pinger project

- the Internet End-to-end Performance Measurement (IEPM) project by SLAC
- uses ping to measure rtt and packet loss around the world
- http://www-iepm.slac.stanford.edu/pinger/
- started in 1995
- over 600 sites in over 125 countries

13 / 30
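This kind of active rtt and loss measurement can be scripted in a few lines. Below is a minimal sketch (an addition for illustration, not from the lecture) that shells out to the system ping command; it assumes a Unix-like ping whose reply lines contain "time=12.3 ms", which varies by platform:

#!/usr/bin/env ruby
# active measurement sketch: run ping and extract per-packet rtts
host = ARGV[0] || "www.example.com"    # hypothetical default target
count = 5
rtts = []

IO.popen(["ping", "-c", count.to_s, host]) do |io|
  io.each_line do |line|
    # reply lines look like "... icmp_seq=1 ttl=55 time=12.3 ms"
    rtts << $1.to_f if line =~ /time=([\d.]+)/
  end
end

loss = 100.0 * (count - rtts.size) / count
if rtts.empty?
  printf "no replies: %.0f%% loss\n", loss
else
  printf "rtt min/avg/max = %.1f/%.1f/%.1f ms, loss %.0f%%\n",
         rtts.min, rtts.sum / rtts.size, rtts.max, loss
end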

pinger project monitoring sites

- monitoring (red), beacon (blue), and remote (green) sites
- beacon sites are monitored by all monitors

from pinger web site

14 / 30

pinger project monitoring sites in east asia

- monitoring (red) and remote (green) sites

from pinger web site

15 / 30

pinger packet loss

- packet loss observed from N. America
- exponential improvement in 10 years

from pinger web site

16 / 30

pinger minimum rtt

- minimum rtts observed from N. America
- gradual shift from satellite to fiber in S. Asia and Africa

from pinger web site

17 / 30

linear regression

- fitting a straight line to data
- least squares method: minimize the sum of squared errors

[scatter plot "v4/v6 rtts": IPv4 response time (msec) vs. IPv6 response time (msec), with fitted line 9.28 + 1.03 * x]

18 / 30

least squares method

a linear function minimizing squared errors: $f(x) = b_0 + b_1 x$

the 2 regression parameters can be computed by

$$b_1 = \frac{\sum xy - n \bar{x} \bar{y}}{\sum x^2 - n (\bar{x})^2} \qquad b_0 = \bar{y} - b_1 \bar{x}$$

where

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i \qquad \sum xy = \sum_{i=1}^{n} x_i y_i \qquad \sum x^2 = \sum_{i=1}^{n} (x_i)^2$$

19 / 30
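A quick worked example with made-up numbers (an addition to these notes): for the three points (1, 2), (2, 3), (3, 5), we have $n = 3$, $\bar{x} = 2$, $\bar{y} = 10/3$, $\sum xy = 2 + 6 + 15 = 23$, and $\sum x^2 = 14$, so

$$b_1 = \frac{23 - 3 \cdot 2 \cdot \frac{10}{3}}{14 - 3 \cdot 2^2} = \frac{3}{2} \qquad b_0 = \frac{10}{3} - \frac{3}{2} \cdot 2 = \frac{1}{3}$$

and the fitted line is $f(x) = 1/3 + (3/2)x$.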

a derivation of the expressions for regression parameters

The error in the ith observation: $e_i = y_i - (b_0 + b_1 x_i)$

For a sample of n observations, the mean error is

$$\bar{e} = \frac{1}{n} \sum_i e_i = \frac{1}{n} \sum_i (y_i - (b_0 + b_1 x_i)) = \bar{y} - b_0 - b_1 \bar{x}$$

Setting the mean error to 0, we obtain: $b_0 = \bar{y} - b_1 \bar{x}$

Substituting $b_0$ in the error expression:

$$e_i = y_i - \bar{y} + b_1 \bar{x} - b_1 x_i = (y_i - \bar{y}) - b_1 (x_i - \bar{x})$$

The sum of squared errors, SSE, is

$$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left[ (y_i - \bar{y})^2 - 2 b_1 (y_i - \bar{y})(x_i - \bar{x}) + b_1^2 (x_i - \bar{x})^2 \right]$$

$$\frac{SSE}{n} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2 - 2 b_1 \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x}) + b_1^2 \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sigma_y^2 - 2 b_1 \sigma_{xy}^2 + b_1^2 \sigma_x^2$$

where $\sigma_x^2$ and $\sigma_y^2$ are the sample variances and $\sigma_{xy}^2$ is the covariance. The value of $b_1$ that gives the minimum SSE can be obtained by differentiating this equation with respect to $b_1$ and equating the result to 0:

$$\frac{1}{n} \frac{d(SSE)}{d b_1} = -2 \sigma_{xy}^2 + 2 b_1 \sigma_x^2 = 0$$

That is:

$$b_1 = \frac{\sigma_{xy}^2}{\sigma_x^2} = \frac{\sum xy - n \bar{x} \bar{y}}{\sum x^2 - n (\bar{x})^2}$$

20 / 30

principal component analysis (PCA)

purpose of PCA
- convert a set of possibly correlated variables into a smaller set of uncorrelated variables
- PCA can be solved by eigenvalue decomposition of a covariance matrix

applications of PCA
- dimensionality reduction: sort principal components by contribution ratio; components with a small contribution ratio can be ignored
- principal component labeling: find meanings of the produced principal components

notes:
- PCA just extracts components with large variance
- not simple if axes are not in the same unit
- a convenient method to automatically analyze complex relationships, but it does not explain the complex relationships

21 / 30

PCA: intuitive explanation

a view of coordinate transformation using a 2D graph
- draw the first axis (the 1st PCA axis) so that it goes through the centroid, along the direction of the maximal variability
- draw the 2nd axis so that it goes through the centroid, is orthogonal to the 1st axis, along the direction of the 2nd maximal variability
- draw the subsequent axes in the same manner

For example, height and weight can be mapped to body size and slimness. We can add sitting height and chest measurement in a similar manner.

[2D graph: original axes x1, x2 and principal axes y1, y2]

22 / 30

PCA (appendix)

principal components can be found as the eigenvectors of a covariance matrix

let $X$ be a $d$-dimensional random variable. we want to find a $d \times d$ orthogonal transformation matrix $P$ that converts $X$ to its principal components $Y$:

$$Y = P^{\top} X$$

solve this equation, assuming $cov(Y)$ is a diagonal matrix (components are independent) and $P$ is an orthogonal matrix ($P^{-1} = P^{\top}$).

the covariance matrix of $Y$ is

$$cov(Y) = E[Y Y^{\top}] = E[(P^{\top} X)(P^{\top} X)^{\top}] = E[(P^{\top} X)(X^{\top} P)] = P^{\top} E[X X^{\top}] P = P^{\top} cov(X) P$$

thus,

$$P \, cov(Y) = P P^{\top} cov(X) P = cov(X) P$$

rewrite $P$ as a row of $d \times 1$ column vectors, $P = [P_1, P_2, \ldots, P_d]$. also, $cov(Y)$ is a diagonal matrix (components are independent):

$$cov(Y) = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_d \end{bmatrix}$$

this can be rewritten as

$$[\lambda_1 P_1, \lambda_2 P_2, \ldots, \lambda_d P_d] = [cov(X) P_1, cov(X) P_2, \ldots, cov(X) P_d]$$

from $\lambda_i P_i = cov(X) P_i$, each $P_i$ is an eigenvector of the covariance matrix of $X$. thus, we can find the transformation matrix $P$ by finding the eigenvectors.

23 / 30
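To make the eigenvector recipe concrete, here is a small Ruby sketch (an illustration added to these notes, using Ruby's standard matrix library and made-up height/weight data): it builds the covariance matrix, extracts its eigenvectors, and sorts the components by contribution ratio.

#!/usr/bin/env ruby
# PCA sketch: eigenvalue decomposition of a covariance matrix
require 'matrix'

# toy data: [height(cm), weight(kg)] pairs (made-up values)
data = [[165, 60], [170, 65], [175, 68], [160, 55], [180, 72]]
n = data.size
d = data[0].size

means = (0...d).map { |j| data.sum { |row| row[j] } / n.to_f }

# covariance matrix cov(X), dividing by n as in the slides
cov = Matrix.build(d, d) do |i, j|
  data.sum { |row| (row[i] - means[i]) * (row[j] - means[j]) } / n.to_f
end

eig = cov.eigen
total = eig.eigenvalues.sum

# sort principal components by eigenvalue (variance), largest first
pairs = eig.eigenvalues.zip(eig.eigenvectors).sort_by { |l, _| -l }
pairs.each_with_index do |(l, v), k|
  printf "PC%d: eigenvalue %7.2f, contribution %.2f, direction %s\n",
         k + 1, l, l / total, v.to_a.map { |e| e.round(3) }.inspect
end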

previous exercise: computing correlation coefficient

compute the correlation coefficient using the sample data sets
- correlation-data-1.txt, correlation-data-2.txt

correlation coefficient:

$$\rho_{xy} = \frac{\sigma_{xy}^2}{\sigma_x \sigma_y} = \frac{\sum_{i=1}^{n} x_i y_i - \frac{(\sum_{i=1}^{n} x_i)(\sum_{i=1}^{n} y_i)}{n}}{\sqrt{\left( \sum_{i=1}^{n} x_i^2 - \frac{(\sum_{i=1}^{n} x_i)^2}{n} \right) \left( \sum_{i=1}^{n} y_i^2 - \frac{(\sum_{i=1}^{n} y_i)^2}{n} \right)}}$$

[scatter plots: data-1: r=0.87 (left), data-2: r=-0.60 (right)]

24 / 30

script to compute correlation coefficient

#!/usr/bin/env ruby
# regular expression for matching 2 floating point numbers
re = /([-+]?\d+(?:\.\d+)?)\s+([-+]?\d+(?:\.\d+)?)/

sum_x = 0.0   # sum of x
sum_y = 0.0   # sum of y
sum_xx = 0.0  # sum of x^2
sum_yy = 0.0  # sum of y^2
sum_xy = 0.0  # sum of x*y
n = 0         # the number of data points

ARGF.each_line do |line|
  if re.match(line)
    x = $1.to_f
    y = $2.to_f
    sum_x += x
    sum_y += y
    sum_xx += x**2
    sum_yy += y**2
    sum_xy += x * y
    n += 1
  end
end

# r = covariance / (stddev_x * stddev_y), in the sum form above
r = (sum_xy - sum_x * sum_y / n) /
    Math.sqrt((sum_xx - sum_x**2 / n) * (sum_yy - sum_y**2 / n))
printf "n:%d r:%.3f\n", n, r

25 / 30
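A usage note (the file name is an assumption of these notes): if the script is saved as correlation.rb, then

ruby correlation.rb correlation-data-1.txt

should report n and an r of about 0.87, the value shown for data-1 on the previous slide.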

today's exercise: linear regression

linear regression by the least squares method
- use the data from the previous exercise: correlation-data-1.txt, correlation-data-2.txt

$$f(x) = b_0 + b_1 x \qquad b_1 = \frac{\sum xy - n \bar{x} \bar{y}}{\sum x^2 - n (\bar{x})^2} \qquad b_0 = \bar{y} - b_1 \bar{x}$$

[scatter plots with fitted lines: data-1: r=0.87, 5.75 + 0.45 * x (left); data-2: r=-0.60, 72.72 - 0.38 * x (right)]

26 / 30

script for linear regression

#!/usr/bin/env ruby
# regular expression for matching 2 floating point numbers
re = /([-+]?\d+(?:\.\d+)?)\s+([-+]?\d+(?:\.\d+)?)/

sum_x = sum_y = sum_xx = sum_xy = 0.0
n = 0

ARGF.each_line do |line|
  if re.match(line)
    x = $1.to_f
    y = $2.to_f
    sum_x += x
    sum_y += y
    sum_xx += x**2
    sum_xy += x * y
    n += 1
  end
end

mean_x = Float(sum_x) / n
mean_y = Float(sum_y) / n
# slope and intercept from the least squares formulas
b1 = (sum_xy - n * mean_x * mean_y) / (sum_xx - n * mean_x**2)
b0 = mean_y - b1 * mean_x
printf "b0:%.3f b1:%.3f\n", b0, b1

27 / 30
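Usage is analogous (again, the file name is assumed): saved as regression.rb,

ruby regression.rb correlation-data-1.txt

should print approximately b0:5.750 b1:0.450, i.e., the fitted line 5.75 + 0.45 * x plotted on the previous slide.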

adding the least squares line to the scatter plot

set xrange [0:160]
set yrange [0:80]
set xlabel "x"
set ylabel "y"
plot "correlation-data-1.txt" notitle with points, \
     5.75 + 0.45 * x lt 3

28 / 30

summary

Class 7 Multivariate analysis
- Data sensing
- Linear regression
- Principal Component Analysis
- exercise: linear regression

29 / 30

next class

Class 8 Time-series analysis (11/20)
- ***makeup class: Nov 20 (Tue) 11:10-12:40, room ɛ11
- Internet and time
- Network Time Protocol
- Time series analysis
- exercise: time-series analysis
- assignment 2

30 / 30