Stochastic Models in Computer Science A Tutorial


Stochastic Models in Computer Science: A Tutorial
Dr. Snehanshu Saha, Department of Computer Science, PESIT BSC, Bengaluru
WCI 2015, August 10 to August 13

Outline
1 Introduction
2 Random Variable
3 Introduction to Probability
4 Probability Distributions - The Motivation
5 Probability Distributions - Explained
6 Applications / Properties
7 Random Graphs
8 Inequalities
9 Queuing Theory
10 Markov Chains

Probability Models in Computer Science

Applications: Machine Learning, Randomized Algorithms, Computer Graphics, Medical Image Analysis, Big Data Analytics, Speech Recognition Systems, Wireless Communication, Communication Networks

Randomness

What is a Random Variable?

Axioms of Probability
Consider a discrete random variable whose set of possible outcomes is x_1, x_2, x_3, ..., x_n:
1. P(x_i) ≥ 0, for i = 1, 2, ..., n
2. Σ_i P(x_i) = 1

Coin Flipping - example
A fair coin is flipped twice. Sample space: {HH, HT, TH, TT}. Let K = number of heads occurring.
K    P(K)
0    1/4
1    1/2
2    1/4
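The pmf in the table above can be checked by simulation; a minimal sketch in Python (the function name and seed are mine, not from the slides):

```python
import random
from collections import Counter

# Simulate two fair coin flips per trial and tabulate K = number of heads.
# Empirical frequencies should approach P(K=0)=1/4, P(K=1)=1/2, P(K=2)=1/4.
def heads_distribution(trials: int, seed: int = 0) -> dict:
    rng = random.Random(seed)
    counts = Counter(sum(rng.randint(0, 1) for _ in range(2)) for _ in range(trials))
    return {k: counts[k] / trials for k in (0, 1, 2)}

freqs = heads_distribution(100_000)
print(freqs)  # close to {0: 0.25, 1: 0.5, 2: 0.25}
```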

Birth of Statistics - The Battle of Plataea
5th century BC: Plataea, a city held by the Persians, was defeated by a Greek alliance. The story concerns the wall of Plataea and the estimation of its height. (PC: sikyon.com)

Applications of several Distributions
What is a Distribution? A distribution is a trend, a dimension, a fitting of data, or just a probability!
Applications of distributions: the Apartment Service Management Problem; The Last Mile Postman (Poisson); Errors in Astronomical Observations (Gaussian)

Gaussian
What do we observe about the Gaussian? It is symmetric, the mean forms the axis of symmetry, and the maximum value is attained at the mean.
Gaussian Distribution: given a random variable X with mean µ and standard deviation σ, the standardized variable is z = (X − µ)/σ.

Binomial Distribution
Length of the sequence of events: FIXED. Each event has exactly TWO outcomes (success or failure). Find the probability of obtaining a sequence with k successes and (n − k) failures:
P(K = k) = C(n, k) θ^k (1 − θ)^{n−k}, k = 0, 1, 2, ..., n
Exercise: explain the coin-flipping problem in light of the Binomial Distribution.
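The binomial pmf transcribes directly into code; a sketch answering the exercise for the coin-flipping slide (n = 2, θ = 1/2; the helper name is mine):

```python
from math import comb

# Binomial pmf: P(K=k) = C(n,k) * theta^k * (1-theta)^(n-k)
def binomial_pmf(k: int, n: int, theta: float) -> float:
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

# The coin-flipping example is the special case n=2, theta=1/2:
pmf = [binomial_pmf(k, 2, 0.5) for k in range(3)]
print(pmf)  # [0.25, 0.5, 0.25], matching the table for two fair flips
```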

Cumulative Distribution Function
Properties of the CDF (continuous random variables): F_X(x) = P(X ≤ x), −∞ < x < ∞
1. 0 ≤ F_X(x) ≤ 1
2. F_X(x_1) ≤ F_X(x_2) if x_1 < x_2
3. lim_{x→∞} F_X(x) = 1
4. P(a < X ≤ b) = F_X(b) − F_X(a)
Consequently: P(X > a) = 1 − F_X(a) and P(X < b) = F_X(b). The CDF F_X(x) is the area in the histogram up to x.

Probability Mass Function
Properties of the PMF (discrete random variables):
1. 0 ≤ P(K) ≤ 1
2. Σ_{i=1}^{n} P(K_i) = 1
3. F_X(x_i) − F_X(x_{i−1}) = P(X ≤ x_i) − P(X ≤ x_{i−1}) = P(X = x_i) = p_X(x_i)
A histogram is the graph of the probability mass function; the total area under the graph = 1.

Probability Density Function
Continuous random variables and the PDF:
a) φ(x|w_i) = A_i e^{−|x − a_i|/b_i}, with ∫_{−∞}^{+∞} p(x|w_i) dx = 1
b) d(x) = p(x|w_1) − p(x|w_2)

Geometric Distribution
Consider a sequence of independent trials, each a success with probability p, 0 < p < 1, or a failure with probability 1 − p. If X represents the trial number of the first success, then X is said to be a Geometric random variable with parameter p.
Example: let N = number of packets transmitted until the first success, with failure probability q = 1 − p.
P(N = n) = q^{n−1}(1 − q), n = 1, 2, 3, ...
E(N) = Σ_{n=1}^{∞} n q^{n−1}(1 − q) = 1/(1 − q)

Cauchy Distribution
A "weird" distribution (example: DAX, German stock exchange).
PDF: f_X(x) = b / (π(b² + x²)), symmetric about zero.
Mean: E(X) = ∫_{−∞}^{+∞} x f_X(x) dx does not converge; the mean is undefined.
Variance: ∫_{−∞}^{+∞} x² b / (π(b² + x²)) dx diverges; the variance is infinite.

Exponential Distribution Properties
f_X(x) = λe^{−λx}, x ≥ 0
1. E(X) = µ = ∫_{0}^{+∞} λx e^{−λx} dx = 1/λ, where λ is the rate; as λ → ∞, E(X) → 0
2. Var(X) = E(X²) − (E(X))²
3. The expected time to arrive is inversely proportional to the arrival rate
The stochastic equivalent of the Laws of Motion.
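Property 1 above, E(X) = 1/λ, can be checked by sampling; a minimal sketch (function name, seed, and sample size are mine):

```python
import random

# Draw n samples from Exponential(lam) and return the sample mean,
# which should approach E[X] = 1/lam for large n.
def exponential_mean(lam: float, n: int, seed: int = 1) -> float:
    rng = random.Random(seed)
    return sum(rng.expovariate(lam) for _ in range(n)) / n

m = exponential_mean(lam=2.0, n=200_000)
print(m)  # close to 1/2
```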

Poisson Distribution Properties
A sequence of independent trials of a random experiment, whose sample space has two outcomes (success and failure), is called a Poisson sequence of trials if the probability of success is not constant but varies from one trial to another.
f_X(k) = e^{−λ} λ^k / k!, where λ is the only parameter of the Poisson distribution.
Limiting case of the Binomial distribution: start from f_X(k) = C(n, k) p^k q^{n−k} and set p = λ/n, λ > 0, so p varies with n. As n → ∞ and p → 0:
P(k successes) = C(n, k) p^k (1 − p)^{n−k} → e^{−λ} λ^k / k!
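The limiting-case claim can be checked numerically by comparing Binomial(n, λ/n) against Poisson(λ) for a large n; a sketch (function names mine):

```python
from math import comb, exp, factorial

# Binomial pmf with parameters n and p.
def binom_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Poisson pmf with parameter lam.
def poisson_pmf(k: int, lam: float) -> float:
    return exp(-lam) * lam**k / factorial(k)

# With p = lam/n and n large, the two pmfs should nearly coincide.
lam = 3.0
gap = max(abs(binom_pmf(k, 1000, lam / 1000) - poisson_pmf(k, lam))
          for k in range(15))
print(gap)  # small: the binomial pmf is close to the Poisson pmf
```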

Poisson Distribution contd.
Note: events where the probability of success is small and the number of trials is large, such that np = λ is of moderate magnitude, can be modeled by P(K = k) ≈ e^{−λ} λ^k / k!. Poisson is the limiting case of the Binomial distribution. The Poisson distribution also has the reproductive property: a sum of independent Poisson variables is again Poisson.

Memoryless Property
P(X > s + t | X > s) = P(X > t), where X is the waiting time measured from a given point, say 0.
The conditional probability that the call arrives after time s + t, given that it has not arrived by time s, equals the probability that you wait at least t more. It does not depend on the fact that you have already waited for s. No memory of s: memoryless.
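The memoryless property can be verified empirically for the exponential distribution; a simulation sketch (parameter values, names, and seed are mine):

```python
import random

# Estimate P(X > s+t | X > s) and P(X > t) for X ~ Exponential(lam);
# the memoryless property says the two should agree.
def memoryless_check(lam=1.0, s=1.0, t=0.5, n=300_000, seed=2):
    rng = random.Random(seed)
    xs = [rng.expovariate(lam) for _ in range(n)]
    survivors = [x for x in xs if x > s]            # condition on X > s
    cond = sum(x > s + t for x in survivors) / len(survivors)
    uncond = sum(x > t for x in xs) / n
    return cond, uncond

cond, uncond = memoryless_check()
print(cond, uncond)  # both close to e^{-0.5} ≈ 0.6065
```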

Applications / Properties
Jensen's Inequality: for any convex function g(x) and any random variable X, E[g(X)] ≥ g(E[X]).
Convex function: for 0 ≤ α ≤ 1 and any x_0 < x_1, g(αx_0 + (1 − α)x_1) ≤ α g(x_0) + (1 − α) g(x_1).
Can we apply this property repeatedly? Yes: g(Σ_{i=0}^{n} α_i x_i) ≤ Σ_{i=0}^{n} α_i g(x_i), for weights α_i ≥ 0 with Σ_i α_i = 1.

Why Random Graphs? Telephone networks, the Internet, social networks, power grids, WSNs, MANETs.

Random Graph
Undirected graph G(V, E), where V is the set of vertices and E the set of edges.
Path: a sequence of distinct edges {(i_0, i_1), (i_1, i_2), ..., (i_{k−1}, i_k), (i_k, j)} is called a path from vertex i_0 to vertex j.

Preliminaries
Clique: a clique in an undirected graph is a subset of its vertices such that every two vertices in the subset are connected by an edge. In social networks, such subgraphs denote a set of people who all know each other.
Maximum Clique: P(∃ a clique of size K in G(n, p)) ≤ C(n, K) p^{C(K, 2)}
Independent Set. Vertex Cover: the minimum number of vertices needed to cover all edges.
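The clique bound above is a union bound over all C(n, K) candidate vertex subsets, each forming a clique with probability p^{C(K,2)}; it is easy to evaluate. A sketch (function name and parameter values are mine):

```python
from math import comb

# Union bound from the slide:
# P(G(n,p) contains a K-clique) <= C(n,K) * p^C(K,2)
def clique_upper_bound(n: int, K: int, p: float) -> float:
    return comb(n, K) * p ** comb(K, 2)

# At p = 0.1, a 10-clique in a 100-vertex graph is astronomically unlikely:
print(clique_upper_bound(100, 10, 0.1))
```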

Random Graphs Description
Let G = (V, E), where V = {1, 2, ..., n} and E = {(i, X(i))}_{i=1}^{n}. The X(i) are independent random variables with P{X(i) = j} = P_j, Σ_{j=1}^{n} P_j = 1, and (i, X(i)) is an edge that starts from vertex i, i = 1, 2, ..., n. That is, from each vertex another vertex is randomly chosen according to the probabilities P_j and joined by an edge.
Probability Space: define a new probability space G(n, p). The sample space Ω is the set of strings of length C(n, 2); x : E → {0, 1} indicates whether the i-th potential edge belongs to the edge set.

Random Graphs - Problems: the Map Coloring Problem; Influence Modeling; Bio-informatics Applications; the Party Problem; Community Clustering.

Ramsey Number
With a reasonably large number of vertices n, you will always find either a complete subgraph on r vertices or an independent set of r vertices. The Ramsey number R(r) denotes how big the graph needs to be in order to guarantee a clique or an independent set of r vertices.
R(1) = 1; R(2) = 2; R(3) = 6, i.e., with 6 vertices you will always find either a complete graph on 3 vertices or an independent set of 3 vertices.
What is R(r)? How small can the Ramsey number be?
Bounds on Ramsey numbers: 2^{r/2} < R(r) ≤ 2^{2r−3}
For n ≤ 2^{r/2}, there exist graphs on n vertices containing neither an r-clique nor an independent set of r vertices.

Inequalities
Markov's Inequality: gives an upper bound on the fraction of a distribution that lies above a particular value. What is the probability that the value of a R.V. X is far from its expectation?
Definition: if X is a non-negative R.V., then for any c > 0, P{X ≥ c} ≤ E[X]/c.
Example: the average height of a kid is 4 ft. What is the probability of finding a kid taller than 6 ft? By Markov, at most 4/6 = 2/3. Taller than 7 ft? At most 4/7.
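The bound can be sanity-checked on any simulated non-negative variable with mean 4; the Uniform(0, 8) model below is purely illustrative (names and seed are mine):

```python
import random

# Markov's inequality: for non-negative X and c > 0, P(X >= c) <= E[X]/c.
# Slide example: E[height] = 4 ft gives P(height >= 6) <= 2/3, P(height >= 7) <= 4/7.
rng = random.Random(3)
xs = [rng.uniform(0.0, 8.0) for _ in range(100_000)]  # E[X] = 4
mean = sum(xs) / len(xs)

def tail_prob(c: float) -> float:
    return sum(x >= c for x in xs) / len(xs)

for c in (6.0, 7.0):
    # The empirical tail probability never exceeds the Markov bound E[X]/c.
    print(c, tail_prob(c), mean / c)
```

Note how loose the bound is here: the true tail P(X ≥ 6) = 1/4 sits well below the guaranteed 2/3.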

Application of Markov s Inequality in Delay Estimation Table: Avg. Delay per Job Request Figure: Average Delay Comparison

Application of Markov s Inequality in Delay Estimation contd. Table: Total Delay per Job Request Figure: Total Delay Comparison

Inequalities contd.
Chebyshev's Inequality: let X be a R.V. (not necessarily non-negative); for any c > 0, P(|X − E[X]| ≥ c) ≤ Var(X)/c².
Boole's Inequality: P(∪_{i=1}^{n} A_i) ≤ Σ_{i=1}^{n} P(A_i), where {A_i}_{i=1}^{n} is a set of events.

Inequalities contd.
Chernoff Bound: let φ(t) = E[e^{tX}] for a R.V. X. Then for any c > 0:
P{X ≥ c} ≤ e^{−tc} φ(t), for t > 0
P{X ≤ c} ≤ e^{−tc} φ(t), for t < 0
Jensen's Inequality of Expectations: if f is a convex function, E[f(X)] ≥ f(E[X]), provided the expectations exist.

Queuing Theory - An Introduction
λ: arrival rate. µ: departure rate. p_n(t): state probabilities, i.e., the probability that at time t there are n customers in the system.
p_1(t + Δt) = λΔt p_0(t) + (1 − λΔt − µΔt) p_1(t) + µΔt p_2(t) + o(Δt)

Queuing Theory contd.
Based on the memoryless property, as Δt → 0:
p_0(t + Δt) = (1 − λΔt) p_0(t) + µΔt p_1(t) + o(Δt)
p_n(t + Δt) = λΔt p_{n−1}(t) + (1 − (λ + µ)Δt) p_n(t) + µΔt p_{n+1}(t) + o(Δt)

Key terminologies
Memoryless Property: no need to remember when the last customer arrived. Push the limit Δt → 0, so that there is practically no difference between t and t + Δt; arrivals and departures cannot be tracked within time intervals. Memoryless.
Steady State Probability: p_n = lim_{t→∞} P{X(t) = n}, n = 0, 1, 2, ..., the limiting or long-run probability that there are exactly n customers in the system. If p_0 = 0.3, then in the long run the system is empty of customers 30% of the time.

Key terminologies contd.
Service Discipline (Ex: FCFS): arrivals one at a time, departures one at a time; customers are serviced in the order they arrived.
Traffic Intensity: exponential inter-arrival times with mean 1/λ and service times with mean 1/µ; ρ = λ/µ, with ρ < 1 for stability.
Rate Equality: in state 0, the rate at which the process leaves = λp_0 and the rate at which the process enters = µp_1. Balance equation (job flow): λp_0 = µp_1.

Key terminologies contd.
Limiting Behavior of the System: steady state probabilities can be computed from
1. 0 = −λp_0 + µp_1
2. 0 = λp_{n−1} − (λ + µ)p_n + µp_{n+1}
3. p_n also satisfies Σ_{n=0}^{∞} p_n = 1
Solutions to the System:
P(n jobs in the system): p_n = (1 − ρ)ρ^n, n = 0, 1, 2, ...
Expected # of jobs in the system: L_s = ρ/(1 − ρ)
Expected # of jobs in the queue: L_q = L_s − ρ
Expected waiting time in the system: W_s = L_s/λ
Expected waiting time in the queue: W_q = L_q/λ
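The closed-form solutions above can be packaged as a small M/M/1 calculator; a sketch assuming λ < µ (function and key names are mine):

```python
# Closed-form M/M/1 steady-state metrics for given arrival/service rates.
def mm1_metrics(lam: float, mu: float) -> dict:
    assert lam < mu, "stability requires rho = lam/mu < 1"
    rho = lam / mu
    Ls = rho / (1 - rho)   # expected number of jobs in the system
    Lq = Ls - rho          # expected number of jobs in the queue
    Ws = Ls / lam          # expected time in the system
    Wq = Lq / lam          # expected time in the queue
    return {"rho": rho, "Ls": Ls, "Lq": Lq, "Ws": Ws, "Wq": Wq}

m = mm1_metrics(lam=2.0, mu=3.0)
print(m)  # rho = 2/3, Ls = 2.0, Lq = 4/3, Ws = 1.0, Wq = 2/3
```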

Types of Queues
M/M/1 Queue — M: memoryless arrival process (Poisson); M: memoryless service time (Exponential); 1: one server.
M/G/1 Queue — M: memoryless arrival process (Poisson); G: general service time; 1: one server.

Markov Chains - An Introduction
What is a Markov Chain? A mathematical model of a random phenomenon evolving with time such that the past affects the future only through the present.
Example: P{X_{n+1} = j | X_n = i, X_{n−1} = i_{n−1}, ..., X_0 = i_0} = P_{i,j} for all states i_0, i_1, ..., i_{n−1}, i, j and all n ≥ 0.
For a Markov chain, the conditional distribution of any future state X_{n+1}, given the past states X_0, X_1, ..., X_{n−1} and the present state X_n, is independent of the past states and depends only on the present state X_n.

Examples: Mouse in a Cage; Bank Account; Simple Random Walk (Drunkard's Walk); Simple Random Walk (Drunkard's Walk) in a city; Actuarial Chains.

Chapman-Kolmogorov Equations
Define the n-step transition probabilities P^n_{ij} to be the probability that a process in state i will be in state j after n additional transitions:
P^n_{ij} = P{X_{n+k} = j | X_k = i}, n ≥ 0, i, j ≥ 0, with P^1_{ij} = P_{ij}.
Chapman-Kolmogorov equations: P^{n+m}_{ij} = Σ_{k=0}^{∞} P^n_{ik} P^m_{kj}, for all n, m ≥ 0 and all i, j.
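For a finite chain, the Chapman-Kolmogorov equations say exactly that P^{n+m} = P^n · P^m as matrices, which can be verified numerically; the 2-state transition matrix below is illustrative, not from the slides:

```python
# Chapman-Kolmogorov for a finite chain: P^(n+m) = P^n · P^m.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(P, n):
    # Start from the identity and multiply by P n times.
    R = [[1.0 if i == j else 0.0 for j in range(len(P))] for i in range(len(P))]
    for _ in range(n):
        R = matmul(R, P)
    return R

P = [[0.7, 0.3],
     [0.4, 0.6]]  # illustrative 2-state transition matrix

lhs = matpow(P, 5)                        # P^5
rhs = matmul(matpow(P, 2), matpow(P, 3))  # P^2 · P^3
print(lhs[0][0], rhs[0][0])  # equal, as Chapman-Kolmogorov requires
```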

THANK YOU!
Acknowledgment: Sara Punagin, M.Tech student of PESIT Bangalore South Campus, for help in preparing the slides.