Entropy. Probability and Computing, Presentation 22.


1. Entropy

2. Introduction

Why are randomness and information related? An event that is almost certain to occur (very high probability) carries almost no information when it happens. Is it news when the sun rises in the morning? An event that happens seldom (very low probability) is interesting, and hence informative.

3. Entropy

We want to measure what intuitively is information/randomness. By analogy with physics, the quantity to be measured will be called entropy. In thermodynamics, entropy is used as a measure of the disorder of a physical system. Example: compressed gas has less entropy than the same gas after it dissipates. Shouldn't it have more entropy? It has more pressure!

4. Enter bits

Consider a uniform probability distribution on a probability space with $2^n$ events. Postulate: each occurrence of such an event carries $n$ units of entropy/information/randomness. Why: we need to concatenate $n$ symbols over the simplest alphabet $\{0, 1\}$ to create a message describing the event: $2^n$ messages for $2^n$ events.

5. Generalizing bits

If $p = 2^{-n}$ is the probability of an elementary event, then $n = \lg \frac{1}{p}$. Postulate: the occurrence of an event $A$, with $\Pr(A) > 0$, brings this amount of information:
$$\lg \frac{1}{\Pr(A)} = -\lg \Pr(A).$$

6. Random variables

A random variable $X : \Omega \to \mathbb{R}$ determines a partitioning of the sample space $\Omega$ into events of the form $X = x$, for all $x$ in the range of values of $X$. Define the entropy of random variable $X$:
$$H(X) = \sum_{x} \Pr(X = x) \lg \frac{1}{\Pr(X = x)},$$
where $x$ ranges over all values of $X$.
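To make the definition concrete, here is a minimal sketch in Python that computes $H(X)$ from a table of probabilities; the function name and the example distributions are mine, not from the lecture.

```python
import math

def entropy(dist):
    """Entropy H(X) in bits of a distribution given as {value: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# Uniform distribution on 2^3 = 8 values: entropy is 3 bits, one per symbol.
uniform8 = {k: 1 / 8 for k in range(8)}
print(entropy(uniform8))  # 3.0

# A skewed distribution carries less information per outcome.
skewed = {"a": 0.9, "b": 0.05, "c": 0.05}
print(entropy(skewed))    # about 0.569
```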

7. Intuition for H(X)

The entropy $H(X)$ of a random variable $X$ is the average number of bits needed to encode an outcome of an experiment that determines the value of $X$.

8. Entropy of a random variable X as an expectation

Let $X$ be a random variable with values in a countable set $V_X$. Treat $V_X$ as a sample space with $X$'s probability distribution: for $r \in V_X$ define $\Pr_X(r) = \Pr(X = r)$. Now define a random variable $Y : V_X \to \mathbb{R}$:
$$Y(r) = \lg \frac{1}{\Pr_X(r)}.$$
Interpretation of entropy as an expectation:
$$H(X) = \sum_{r \in V_X} \Pr(X = r) \lg \frac{1}{\Pr(X = r)} = \sum_{r \in V_X} \Pr_X(r) \, Y(r) = \mathbb{E}[Y].$$

9. Random variables with Bernoulli distributions

Let $X_p$ be a random variable with the Bernoulli distribution, where $p$ is the probability of success. $X_p$ has entropy:
$$H(X_p) = p \lg \frac{1}{p} + (1 - p) \lg \frac{1}{1 - p}.$$
This is a function of $p$, denoted $H(p)$, and called the (binary) entropy function:
$$H(p) = -p \lg p - (1 - p) \lg(1 - p).$$
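The binary entropy function is easy to tabulate, which makes the properties claimed on the next slides concrete. A small sketch (the function name is mine), with the values for $p = 1/2$ and $p = 1/4$ that the following slides use:

```python
import math

def h(p):
    """Binary entropy function H(p) in bits, for 0 <= p <= 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # limiting value: a sure event carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(h(0.5))   # 1.0: a fair coin toss yields exactly one bit
print(h(0.25))  # about 0.811: a biased coin yields less than one bit
print(abs(h(0.25) - h(0.75)) < 1e-12)  # True: symmetry H(p) = H(1 - p)
```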

10. The essential properties of the entropy function

It is continuous and converges to 0 as $p \to 1$ and as $p \to 0$. It is symmetric with respect to $p = \frac{1}{2}$: for $0 < p \le \frac{1}{2}$, $H(p) = H(1 - p)$. Entropy is increasing for $0 < p \le \frac{1}{2}$ and decreasing for $\frac{1}{2} \le p < 1$.

11. Bits

The maximum of the entropy function $H$ is attained at $p = \frac{1}{2}$:
$$H\left(\frac{1}{2}\right) = \frac{1}{2} \lg 2 + \frac{1}{2} \lg 2 = 1.$$
This represents the intuition: the outcome of a toss of a fair coin gives a unit of information, called a bit.

12. Compression reflects information

Example: take $p = \frac{1}{4}$; then
$$H\left(\frac{1}{4}\right) = \frac{1}{4} \lg 4 + \frac{3}{4} \lg \frac{4}{3} \approx 0.811 < 1.$$
How to interpret the property that $H(1/4) < 1$? What is troubling is that we cannot encode an outcome of a single experiment with fewer than one bit. Take a loooong sequence of some $n$ outcomes of such experiments. Represent them by a corresponding sequence of $n$ zeros and ones. This sequence could be compressed to fewer than $n$ bits. Like $0.82\,n$ bits?

13. A connection with combinatorics

The entropy function occurs in combinatorial calculations when estimating a sum of consecutive binomial coefficients. Let $0 \le a \le \frac{1}{2}$ and let $n$ be a natural number. Then
$$\sum_{0 \le k \le an} \binom{n}{k} \le 2^{H(a)\,n}.$$
(see the lecture notes)
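A quick numerical sanity check of the bound, assuming nothing beyond the statement above; the printed values are approximate:

```python
import math

def binomial_tail(n, a):
    """Sum of binomial coefficients C(n, k) for 0 <= k <= a*n."""
    return sum(math.comb(n, k) for k in range(int(a * n) + 1))

def h(p):
    """Binary entropy function H(p) in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p) if 0 < p < 1 else 0.0

n, a = 100, 0.25
lhs = math.log2(binomial_tail(n, a))  # lg of the sum of binomial coefficients
rhs = h(a) * n                        # the entropy bound H(a) * n
print(lhs, "<=", rhs)  # about 78.3 <= about 81.1
```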

14. Realization of random variables

Imagine that we carry out experiments to obtain a sequence $x_1, x_2, \ldots$ as a realization of $X_1, X_2, \ldots$, where $X_i = x_i$ according to the probability distribution of $X_i$. We want to process this sequence of outcomes of experiments to produce a new sequence of values which is, with respect to its statistical properties, as if it were a realization of a sequence of random variables $Y_1, Y_2, \ldots$. This means simulating one sequence of random variables given a sequence of values taken on by some other sequence of random variables.

15. Simulating a biased coin

Suppose that $X_1, X_2, \ldots$ is a sequence of outcomes of independent tosses of a fair coin. We want to simulate a sequence of independent Bernoulli trials $Y_1, Y_2, \ldots$, each with probability $0 < p < 1$ of success, for a given $p$. This could be interpreted as simulating a biased coin using a fair coin.

16. An intuition for the simulation

Suppose we can draw a random real number $r$ from the interval $(0, 1)$. If we draw $r \le p$ then this is a success and we output 1, and if we draw $r > p$ then this is a failure and we output 0.

17. How to simulate drawing a random real number?

We can represent such a number $r$ by its binary expansion $r = 0.r_1 r_2 r_3 \ldots$, where $r_i \in \{0, 1\}$ are bits. For example, $\frac{1}{2} = 0.1000\ldots$ and $\frac{1}{4} = 0.0100\ldots$ in binary. The random bits $r_1 r_2 r_3 \ldots$ are taken from the sequence $X_1, X_2, \ldots$. We use each of these bits only once.

18. Towards a simulation

Let $p = 0.p_1 p_2 p_3 \ldots$ be a binary representation of $p$. Suppose we produce a randomly selected $r = 0.r_1 r_2 r_3 \ldots$. What does it mean that $r < p$? It means that $r_i < p_i$, where $i$ is the first position at which the expansions of $p$ and $r$ differ. (simulation next)

19. The simulation

Proceed bit by bit, beginning with comparing $r_1$ with $p_1$:
- if $r_1 < p_1$ then this indicates success,
- if $r_1 > p_1$ then this indicates failure,
- if $r_1 = p_1$ then this bit does not help and we proceed to consider the second bit.
This process continues until we find the first position $i$ such that $r_i \ne p_i$, as in the sketch below.
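A compact rendering of this procedure, assuming a source of fair random bits (Python's random.getrandbits stands in for the fair-coin sequence $X_1, X_2, \ldots$); the function names are mine:

```python
import random

def biased_coin(p_bits):
    """One Bernoulli(p) trial from fair coin flips.

    p_bits yields the (infinite) binary expansion p = 0.p1 p2 p3 ...
    of the bias p. Returns 1 (success) or 0 (failure).
    Each fair bit is used only once.
    """
    for p_i in p_bits:
        r_i = random.getrandbits(1)  # one toss of the fair coin
        if r_i < p_i:
            return 1  # r < p: success
        if r_i > p_i:
            return 0  # r > p: failure
        # r_i == p_i: this bit does not decide; proceed to the next one

def third():
    """Binary expansion of p = 1/3 = 0.010101..., bit by bit."""
    while True:
        yield 0
        yield 1

trials = [biased_coin(third()) for _ in range(100000)]
print(sum(trials) / len(trials))  # close to 1/3
```

Each bit comparison decides with probability $1/2$, so on average only two fair bits are consumed per simulated biased toss.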

20. Extracting randomness

We now discuss simulating $Y_1, Y_2, \ldots$, a sequence of outcomes of independent tosses of a fair coin, from a given sequence $X_1, X_2, \ldots$. This is called extracting randomness from $X_1, X_2, \ldots$.

21. Generalizing extraction

Consider a procedure $R$ that for a given $X$ produces $R(X)$, a string of bits. Given a sequence $X_1, X_2, \ldots$ of random variables, a new simulated sequence is defined to be $R(X_1), R(X_2), \ldots$. We say that the bits within one $R(X_i)$ are produced simultaneously, while the bits of $R(X_i)$ and of $R(X_j)$, for $i \ne j$, are produced separately. For $R$ to be an extractor it needs additional properties, which we define next in two ways and then argue why the two definitions are equivalent.

22. One take on an extractor R

We want the sequence obtained by concatenating $R(X_1), R(X_2), \ldots$ to be such that, when interpreted as a sequence of bits $r_1, r_2, \ldots$ with $r_i \in \{0, 1\}$, the values $r_i$ are independent of each other and satisfy $\Pr(r_i = 0) = \Pr(r_i = 1) = \frac{1}{2}$.

23. Another take on an extractor R

It has two properties:
- Sequences produced separately are independent of each other.
- For any integer $k > 0$, if it is possible to extract $k$ bits simultaneously, meaning that $R(X) = (r_1, \ldots, r_k)$ for some sequence of $k$ bits $(r_1, \ldots, r_k)$, then, for any sequence $(z_1, \ldots, z_k)$ of $k$ bits, the probability of extracting $(z_1, \ldots, z_k)$ as a value of $R(X)$ is the same as the probability of extracting $(r_1, \ldots, r_k)$.

24. Equivalence of the two definitions

These two takes on an extractor define the same concept.

25. An example of extracting randomness

An experiment (realization) returns an integer selected uniformly at random from the interval $[0, 7]$. Fix a one-to-one correspondence between the integers in the interval $[0, 7]$ and the 8 strings of 3 bits each. Given a number in $[0, 7]$, output the corresponding string. This is a clean situation, as there are precisely $2^3 = 8$ binary strings of length 3. I always liked 8, it is a nice number.

26. From 8 to 10

Suppose $Y \in [0, 9]$ is an integer selected uniformly at random from among 10 integers. Since $10 = 8 + 2$, we may assign a string of 3 bits to some 8 numbers in the interval, which leaves two integers in the interval. To them we may assign two different single bits, one bit per number. Sometimes we output three bits and sometimes just one bit? (a code sketch of this assignment follows)
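One concrete way to realize this assignment, a sketch under the assumption that 0 through 7 map to their 3-bit binary representations and that 8 and 9 map to single bits (the slide fixes no particular correspondence):

```python
import random

def extract_from_10(y):
    """Extract bits from an integer y drawn uniformly from [0, 9].

    Values 0..7 yield their 3-bit binary representation;
    values 8 and 9 yield a single bit each.
    """
    if y < 8:
        return format(y, "03b")  # three simultaneous bits
    return str(y - 8)            # one bit: '0' for 8, '1' for 9

stream = "".join(extract_from_10(random.randrange(10)) for _ in range(100000))
print(stream.count("1") / len(stream))  # close to 1/2
```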

27. From 10 to 11

What if $Z \in [0, 10]$ is an integer selected uniformly at random from among 11 integers? Now $11 = 8 + 2 + 1 = 2^3 + 2^1 + 2^0$. There is no apparent way to extend the construction for 10 integers, unless we modify the method significantly. Sometimes we output three bits, sometimes just one bit, and sometimes nothing at all?

28. Limits on extraction

Fact (upper bound on the rate of extraction): no extraction function can produce more than $H(X)$ bits simultaneously on average (that is, per realization), if each random variable among $X_1, X_2, \ldots$ has the probability distribution of $X$.

29. Extracting randomness from a biased coin

We want to process outcomes of tosses of a coin. Such outcomes of coin tosses represent a Bernoulli sequence, that is, a sequence of outcomes of independent Bernoulli trials. There is some probability $p$ of heads coming up (which we call success), and $q = 1 - p$ is the probability of tails (or failure). The number $p$ is not known: we want an extractor that works similarly for any $0 < p < 1$ without having $p$ as part of its code. This is obviously impossible!

30. The main insight

The input is made of outcomes of experiments $x_1, x_2, x_3, \ldots$, each either heads H or tails T, produced by independent tosses of a coin (we do not know if it is fair). Partition the input into pairs $(x_1, x_2), (x_3, x_4), (x_5, x_6), \ldots$. There are four possibilities for a pair: HH, TT, HT, TH, which occur with the respective probabilities $p^2$, $q^2$, $pq$ and $qp$.

31. We experience illumination

The probability of HT is the same as that of TH: $pq = qp$.
- Output T for each pair TH.
- Output H for each pair HT.
- Ignore the pairs TT and HH.
The guy who invented this simply knew God's phone number.
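This trick is usually credited to von Neumann. A short sketch (names mine), using a simulated biased coin only to test that the output is unbiased; the extractor itself never sees $p$:

```python
import random

def von_neumann(bits):
    """Extract unbiased bits from iid biased coin flips (von Neumann's trick).

    bits is a sequence over {'H', 'T'}. Pairs HT -> H, TH -> T;
    pairs HH and TT are discarded.
    """
    out = []
    for a, b in zip(bits[::2], bits[1::2]):
        if a != b:
            out.append(a)  # 'H' for HT, 'T' for TH
    return out

p = 1 / 3  # unknown to the extractor; used only to simulate the biased coin
flips = ["H" if random.random() < p else "T" for _ in range(200000)]
out = von_neumann(flips)
print(len(out), out.count("H") / len(out))  # about 2*p*q*100000 bits, ratio near 1/2
```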

32. Efficiency of this extraction

How many input bits do we use per one output bit on average? Each pair HT or TH can be considered a success at producing output, and such a success occurs with probability $2pq$. The average waiting time for such a success is $\frac{1}{2pq}$ pairs, or twice as many, $\frac{1}{pq}$, coin tosses. When $p = q$, four coin tosses are needed on average to produce one output bit. It can only be worse, as $p(1-p) < \frac{1}{4}$ for $0 < p < 1$ with $p \ne \frac{1}{2}$. Example: for $p = 1/3$, the expected number of coin tosses per output bit is $\frac{9}{2} > 4$.

33. A streamlined extraction procedure

We partition the input, consisting of consecutive realizations of the probability distribution of $X$, into consecutive pairs $(x_1, x_2), (x_3, x_4), (x_5, x_6), \ldots$. We will be sending bits to three streams: one of them is the output, and the other two streams are denoted $Y$ and $Z$. The bits making up $Y$ and $Z$ are to have the property that they are outcomes of independent tosses of coins of some unknown biases, one such coin for $Y$ with its bias, and another coin for $Z$ with its bias.

34. Three possible actions

There are three actions to perform for a consecutive pair $(x_i, x_{i+1})$ of inputs. Some of these actions may be void.
1. Send to output: if $(x_i, x_{i+1}) = $ HT then output H, and if $(x_i, x_{i+1}) = $ TH then output T.
2. Send to $Y$: if $(x_i, x_{i+1}) = $ HH then add H to $Y$, and if $(x_i, x_{i+1}) = $ TT then add T to $Y$.
3. Send to $Z$: if either $(x_i, x_{i+1}) = $ HH or $(x_i, x_{i+1}) = $ TT then add H to $Z$, and if either $(x_i, x_{i+1}) = $ HT or $(x_i, x_{i+1}) = $ TH then add T to $Z$.

35. Final comments

The streams $Y$ and $Z$ are to be processed recursively, and the outcomes of this processing are to be interleaved with what we send directly to the output. The ultimate output bits, obtained from the directly produced output interleaved with the bits produced recursively from $Y$ and $Z$, are to be independent and unbiased. As the process continues, the number of streams increases, with no upper bound on the number of recursively processed streams.
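Combining the three actions with this recursive processing gives a procedure along the following lines. This is a sketch, with names mine, of what is essentially Peres's iterated von Neumann extractor; it concatenates the three parts rather than interleaving them, which yields the same collection of output bits:

```python
def extract(bits):
    """Recursive pairwise extractor (Peres-style); bits is a list over {'H', 'T'}.

    For each pair: HT -> output H, TH -> output T;
    HH -> H appended to stream Y, TT -> T appended to stream Y;
    an equal pair appends H to stream Z, an unequal pair appends T to Z.
    Y and Z are then processed recursively.
    """
    if len(bits) < 2:
        return []
    out, y, z = [], [], []
    for a, b in zip(bits[::2], bits[1::2]):
        if a != b:
            out.append(a)   # HT -> H, TH -> T
            z.append("T")
        else:
            y.append(a)     # HH -> H, TT -> T
            z.append("H")
    return out + extract(y) + extract(z)
```

Each recursive call processes at most half as many bits, so the recursion terminates; the streams multiply just as the slide describes.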

36. Explaining the algorithm

Let us look at the occurrences of tails in the three streams, to be able to argue why they are independent. They can be obtained in the following three ways:
1. tails in the output come from input pairs of the form TH;
2. tails in $Y$ come from input pairs of the form TT;
3. tails in $Z$ come from input pairs of the form HT or TH.
The pairs TT are independent of the pairs TH or HT, as they are produced by different coin tosses, with no overlap. An occurrence of tails in the output is independent of those in $Z$: given tails in $Z$, the pair could be either HT, which adds H to the output, or TH, which adds T to the output.

37. Quality

The extractor we described is optimal: it extracts all the randomness that is there. (see the lecture notes for details)
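A quick empirical check of both claims, unbiased output and an extraction rate approaching $H(p)$, using the extract sketch above with a coin of bias $p = 1/3$ (so $H(1/3) \approx 0.918$):

```python
import random

p = 1 / 3
n = 2 ** 17
flips = ["H" if random.random() < p else "T" for _ in range(n)]
out = extract(flips)  # the recursive extractor sketched after slide 35
print(out.count("H") / len(out))  # close to 1/2: output looks unbiased
print(len(out) / n)               # approaches H(1/3) ~ 0.918 for long inputs
```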

38. Homework

Your friend flips a fair coin repeatedly until the first heads occurs. Let $X$ be a random variable equal to the number of flips. You want to determine how many flips were performed in a specific experiment. You are allowed to ask a series of yes/no questions of the following form: you give your friend a set of integers, and your friend answers yes if the number of flips is in the set and no otherwise.

39. Questions with hints

1. Give a formula for $H(X)$. Hint: this is a specific random variable; apply the definition of entropy.
2. Describe a strategy such that the expected number of questions you ask before determining the number of flips is $H(X)$. Hint: find a strategy with a formula for the expected number of questions that looks the same as the formula for $H(X)$.
3. Give an intuitive explanation of why there is no strategy that would allow one to ask fewer than $H(X)$ questions on average. Hint: just verbalize your intuitions, referring to entropy.
