Entropy. Probability and Computing, Presentation 22.
2 Introduction. Why are randomness and information related? An event that is almost certain to occur (very high probability) carries almost no information when it happens. Is it news when the sun rises in the morning? An event that happens seldom (very low probability) is interesting, and so informative.
3 Entropy. We want to measure what intuitively is information/randomness. By analogy with physics, the quantity to be measured will be called entropy. In thermodynamics, entropy is used as a measure of the disorder of a physical system. Example: compressed gas has less entropy than the same gas after dissipation. Shouldn't it have more entropy? It has more pressure!
4 Enter bits. Consider a uniform probability distribution on a probability space with 2^n events. Postulate: each occurrence of such an event carries n units of entropy/information/randomness. Why: we need to concatenate n symbols over the simplest alphabet {0, 1} to create a message describing the event: 2^n messages for 2^n events.
5 Generalizing bits. If p = 2^{-n} is the probability of an elementary event, then n = lg(1/p). Postulate: the occurrence of an event A, with Pr(A) > 0, brings this amount of information: lg(1/Pr(A)) = -lg Pr(A).
6 Random variables. A random variable X : Ω → R determines a partitioning of the sample space Ω into events of the form X = x, for x ranging over the values of X. Define the entropy of the random variable X: H(X) = Σ_x Pr(X = x) lg(1/Pr(X = x)), where the sum is over all values x of X.
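The definition above can be checked numerically. A minimal sketch (the helper name `entropy` is mine, not from the lecture):

```python
import math

def entropy(dist):
    """Shannon entropy, in bits, of a distribution given as {value: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# A uniform distribution over 8 values: each outcome carries lg 8 = 3 bits.
fair8 = {k: 1 / 8 for k in range(8)}
print(entropy(fair8))  # → 3.0
```

This matches the earlier postulate: a uniform space of 2^n events has entropy n.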
7 Intuition for H(X). The entropy H(X) of a random variable X is the average number of bits needed to encode an outcome of an experiment that determines the value of X.
8 Entropy of a random variable X as expectation. Let X be a random variable with values in a countable set V_X. Treat V_X as a sample space with X's probability distribution: for r ∈ V_X define Pr_X(r) = Pr(X = r). Now define a random variable Y : V_X → R by Y(r) = lg(1/Pr_X(r)). Interpretation of entropy as expectation: H(X) = Σ_{r ∈ V_X} Pr(X = r) lg(1/Pr(X = r)) = Σ_{r ∈ V_X} Pr_X(r) Y(r) = E[Y].
9 Random variables with Bernoulli distributions. Let X_p be a random variable with the Bernoulli distribution, where p is the probability of success. X_p has entropy H(X_p) = p lg(1/p) + (1 - p) lg(1/(1 - p)). This is a function of p, denoted H(p), and called the (binary) entropy function: H(p) = -p lg p - (1 - p) lg(1 - p).
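The binary entropy function transcribes directly into code (the function name `H` follows the slide's notation; treating the endpoints p = 0 and p = 1 by their limit value 0 is my convention):

```python
import math

def H(p):
    """Binary entropy function H(p) = -p lg p - (1-p) lg(1-p), in bits."""
    if p in (0.0, 1.0):
        return 0.0  # limit value as p -> 0 or p -> 1
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(H(0.5))   # → 1.0
print(H(0.25))  # ≈ 0.811
```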
10 The essential properties of the entropy function. It is continuous and converges to 0 as p → 0 and as p → 1. It is symmetric with respect to p = 1/2: H(p) = H(1 - p) for 0 < p ≤ 1/2. The entropy function is increasing for 0 < p ≤ 1/2 and decreasing for 1/2 ≤ p < 1.
11 Bits. The maximum of the entropy function H is attained at p = 1/2: H(1/2) = (1/2) lg 2 + (1/2) lg 2 = 1. This represents the intuition that the outcome of a toss of a fair coin gives one unit of information, called a bit.
12 Compression reflects information. Example: take p = 1/4; then H(1/4) = (1/4) lg 4 + (3/4) lg(4/3) ≈ 0.811 < 1. How to interpret the property that H(1/4) < 1? What is troubling is that we cannot encode the outcome of a single experiment with fewer than one bit. Take a loooong sequence of some n outcomes of such experiments. Represent them by a corresponding sequence of n zeros and ones. This sequence could be compressed to fewer than n bits. Like 0.82n bits?
13 A connection with combinatorics. The entropy function occurs in combinatorial calculations when estimating a sum of consecutive binomial coefficients. Let 0 ≤ a ≤ 1/2 and let n be a natural number. Then Σ_{0 ≤ k ≤ an} C(n, k) ≤ 2^{H(a) n}. (See the lecture notes.)
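The bound can be spot-checked numerically. A small sketch, assuming the bound as stated and using `math.comb` for binomial coefficients:

```python
import math

def H(a):
    """Binary entropy function, in bits."""
    if a in (0.0, 1.0):
        return 0.0
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)

# Check sum_{0 <= k <= a n} C(n, k) <= 2^{H(a) n} for a few n and a <= 1/2.
for n in (10, 50, 200):
    for a in (0.1, 0.25, 0.5):
        lhs = sum(math.comb(n, k) for k in range(int(a * n) + 1))
        assert lhs <= 2 ** (H(a) * n)
print("bound holds on all tested cases")
```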
14 Realization of random variables. Imagine that we carry out experiments to obtain a sequence x_1, x_2, ... as a realization of X_1, X_2, ..., where X_i = x_i according to the probability distribution of X_i. We want to process this sequence of outcomes to produce a new sequence of values that is, with respect to its statistical properties, as if it were a realization of a sequence of random variables Y_1, Y_2, .... This means simulating one sequence of random variables given the values taken on by a sequence of other random variables.
15 Simulating a biased coin. Suppose that X_1, X_2, ... is a sequence of outcomes of independent tosses of a fair coin. We want to simulate a sequence of independent Bernoulli trials Y_1, Y_2, ..., each with probability 0 < p < 1 of success, for a given p. This can be interpreted as simulating a biased coin using a fair coin.
16 An intuition for the simulation. Suppose we can draw a random real number r from the interval (0, 1). If we draw r such that r ≤ p, then this is a success and we output 1; if we draw r > p, then this is a failure and we output 0.
17 How to simulate drawing a random real number? We can represent such a number r by its binary expansion r = 0.r_1 r_2 r_3 ..., where r_i ∈ {0, 1} are bits. For example, 1/2 = 0.1000... and 1/4 = 0.0100.... The random bits r_1 r_2 r_3 ... are taken from the sequence X_1, X_2, .... We use each of these bits only once.
18 Towards a simulation. Let p = 0.p_1 p_2 p_3 ... be a binary representation of p. Suppose we produce a randomly selected r = 0.r_1 r_2 r_3 .... What does it mean that r ≤ p? It means that, for i the first bit position on which p and r differ, we have r_i < p_i. (Simulation next.)
19 The simulation. Proceed bit by bit, beginning by comparing r_1 with p_1: if r_1 < p_1 then this indicates success; if r_1 > p_1 then this indicates failure; and if r_1 = p_1 then this bit does not decide and we proceed to the second bit. The process continues until we find the first position i such that r_i ≠ p_i.
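The bit-by-bit comparison can be sketched as follows. The helper names `biased_bit` and `bits_of` are mine, and truncating p's binary expansion to 64 bits is a practical assumption the slides do not make:

```python
import random

def biased_bit(p_bits, fair_coin):
    """Simulate one toss of a coin with bias p, given the bits of
    p = 0.p1 p2 p3 ... and a source of fair bits.
    Returns 1 (success, r <= p) or 0 (failure, r > p)."""
    for p_i in p_bits:
        r_i = fair_coin()      # one fresh fair bit, used only once
        if r_i < p_i:
            return 1           # r < p decided on the first differing bit
        if r_i > p_i:
            return 0           # r > p decided on the first differing bit
    return 1                   # expansion exhausted: r equals p

def bits_of(p, n=64):
    """First n bits of the binary expansion of p in (0, 1)."""
    out = []
    for _ in range(n):
        p *= 2
        bit = int(p)
        out.append(bit)
        p -= bit
    return out

random.seed(0)
coin = lambda: random.randint(0, 1)
samples = [biased_bit(bits_of(0.25), coin) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 0.25
```

Note that on average only about two fair bits are consumed per simulated biased toss, since each comparison decides with probability 1/2.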
20 Extracting randomness. We now discuss simulating Y_1, Y_2, ..., a sequence of outcomes of independent tosses of a fair coin, from X_1, X_2, .... This is called extracting randomness from X_1, X_2, ....
21 Generalizing extraction. Consider a procedure R that for a given X produces R(X), a string of bits. Given a sequence X_1, X_2, ... of random variables, the new simulated sequence is defined to be R(X_1), R(X_2), .... We say that the bits within R(X_i) are produced simultaneously, while the bits of R(X_i) and of R(X_j), for i ≠ j, are produced separately. For R to be an extractor it needs additional properties, which we define next in two ways and then argue are equivalent.
22 One take on an extractor R. We want the sequence obtained by concatenating R(X_1), R(X_2), ... to be such that, when it is interpreted as a sequence of bits r_1, r_2, ... with r_i ∈ {0, 1}, the values r_i are independent of each other and satisfy Pr(r_i = 0) = Pr(r_i = 1) = 1/2.
23 Another take on an extractor R. It has two properties. First, sequences produced separately are independent of each other. Second, for any integer k > 0, if it is possible to extract k bits simultaneously, meaning that R(X) = (r_1, ..., r_k) for some sequence of k bits (r_1, ..., r_k), then, for any sequence (z_1, ..., z_k) of k bits, the probability of extracting (z_1, ..., z_k) as a value of R(X) is the same as the probability of extracting (r_1, ..., r_k).
24 Equivalence of two definitions. These two takes on an extractor define the same concept.
25 An example of extracting randomness. An experiment (realization) returns an integer selected uniformly at random from the interval [0, 7]. Fix a one-to-one correspondence between the integers in [0, 7] and the 8 strings of 3 bits each. Given a number in [0, 7], output the corresponding string. This is a clean situation, as there are precisely 2^3 = 8 binary strings of length 3. I always liked 8, it is a nice number.
26 From 8 to 10. Suppose Y ∈ [0, 9] is an integer selected uniformly at random from among 10 integers. Since 10 = 8 + 2, we may assign a string of 3 bits to some 8 numbers in the interval, which leaves two integers. To these two we may assign two different single bits, one bit per number. So sometimes we output three bits and sometimes just one bit?
27 From 10 to 11. What if Z ∈ [0, 10] is an integer selected uniformly at random from among 11 integers? Now 11 = 8 + 2 + 1 = 2^3 + 2^1 + 2^0. There is no apparent way to extend the construction for 10 integers, unless we modify the method significantly. Sometimes we output three bits, sometimes just one bit, and sometimes nothing at all?
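One way to realize the 8/2 and 8/2/1 splits from the last two slides is a greedy decomposition into powers of two. The `extract` helper below is my illustration of that idea, not the lecture's prescribed construction:

```python
def extract(y, n):
    """Extract unbiased bits from y drawn uniformly from {0, ..., n-1},
    by splitting n into decreasing powers of two (e.g. 11 = 8 + 2 + 1).
    Returns a bit string; possibly empty (the block of size 2^0)."""
    base = 0
    k = n.bit_length() - 1          # largest power of two that can fit
    while base < n:
        while base + (1 << k) > n:  # shrink block until it fits
            k -= 1
        block = 1 << k
        if y < base + block:
            # y falls in a block of 2^k values: output k bits of its offset
            return format(y - base, "b").zfill(k) if k > 0 else ""
        base += block
        k -= 1
    return ""

# n = 10 = 8 + 2: values 0-7 give 3 bits, values 8-9 give 1 bit.
print([extract(y, 10) for y in range(10)])
# n = 11 = 8 + 2 + 1: value 10 yields no bits at all.
print(repr(extract(10, 11)))  # → ''
```

Within a block of size 2^k, the offset is uniform, so the k simultaneously produced bits are equiprobable, matching the second property of an extractor.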
28 Limits on extraction. Fact (upper bound on the rate of extraction): no extraction function can produce more than H(X) bits simultaneously on average, if each random variable among X_1, X_2, ... has the probability distribution of X.
29 Extracting randomness from a biased coin. We want to process outcomes of tosses of a coin. Such outcomes form a Bernoulli sequence, that is, a sequence of outcomes of independent Bernoulli trials. There is some probability p of heads coming up (which we call success), and q = 1 - p is the probability of tails (failure). The number p is not known: we want an extractor that works similarly for any 0 < p < 1 without having p as part of its code. This is obviously impossible!
30 The main insight. The input is made of outcomes of experiments x_1, x_2, x_3, ..., each either heads H or tails T, produced by independent tosses of a coin (we do not know whether it is fair). Partition the input into pairs (x_1, x_2), (x_3, x_4), (x_5, x_6), .... There are four possibilities for a pair: HH, TT, HT, TH, which occur with respective probabilities p^2, q^2, pq, and qp.
31 We experience illumination. The probability of HT is the same as that of TH: pq = qp. Output H for each pair HT. Output T for each pair TH. Ignore the pairs TT and HH. The guy who invented this simply knew God's phone number.
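This pairing scheme (the classical von Neumann extractor) transcribes directly into code; the function name and the 70/30 test bias are my choices:

```python
import random

def von_neumann(flips):
    """Von Neumann extractor: pair up the flips; HT -> 'H', TH -> 'T',
    HH and TT are discarded. Works for any unknown bias p."""
    out = []
    for a, b in zip(flips[::2], flips[1::2]):
        if a == "H" and b == "T":
            out.append("H")
        elif a == "T" and b == "H":
            out.append("T")
    return out

random.seed(1)
# A coin biased 70/30 toward heads; the output should still be about 50/50.
flips = random.choices("HT", weights=[0.7, 0.3], k=200_000)
out = von_neumann(flips)
print(out.count("H") / len(out))  # close to 0.5
```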
32 Efficiency of this extraction. How many input bits do we use per output bit on average? Each pair HT or TH can be considered a success in producing output, and such a success occurs with probability 2pq. The average waiting time for a success is 1/(2pq) pairs, or twice as many, 1/(pq), coin tosses. When p = q, four coin tosses are needed on average to produce one output bit. It can only be worse, since p(1 - p) < 1/4 for p ≠ 1/2. Example: for p = 1/3, the expected number of coin tosses per output bit is 1/(pq) = 9/2 > 4.
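The 1/(pq) figure can be checked empirically. A quick sketch for p = 1/3 (the simulation parameters are my choices):

```python
import random

random.seed(2)
p = 1 / 3
# Count biased coin tosses consumed per output bit of the HT/TH pairing
# scheme; theory says 1/(p*(1-p)) = 9/2 tosses on average.
tosses = outputs = 0
for _ in range(100_000):
    a = random.random() < p   # first toss of the pair (True = heads)
    b = random.random() < p   # second toss of the pair
    tosses += 2
    if a != b:                # HT or TH: one output bit produced
        outputs += 1
print(tosses / outputs)  # close to 4.5
```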
33 A streamlined extraction procedure. We partition the input, consisting of consecutive realizations of the probability distribution of X, into consecutive pairs (x_1, x_2), (x_3, x_4), (x_5, x_6), .... We will be sending bits to three streams: one of them is the output, and the other two are denoted Y and Z. The bits making up Y and Z are to have the property that they are outcomes of independent tosses of coins of some unknown biases: one such coin for Y with its bias, and another coin for Z with its bias.
34 Three possible actions. There are three actions to perform for each consecutive pair (x_i, x_{i+1}) of inputs; some of these actions may be void.
1. Send to output: if (x_i, x_{i+1}) = HT then output H, and if (x_i, x_{i+1}) = TH then output T.
2. Send to Y: if (x_i, x_{i+1}) = HH then add H to Y, and if (x_i, x_{i+1}) = TT then add T to Y.
3. Send to Z: if (x_i, x_{i+1}) = HH or TT then add H to Z, and if (x_i, x_{i+1}) = HT or TH then add T to Z.
35 Final comments. The streams Y and Z are processed recursively, and the outcomes of this processing are interleaved with what we send directly to the output. The ultimate output bits, obtained by interleaving the directly produced output with the bits produced recursively from Y and Z, are independent and unbiased. As the process continues, the number of streams increases, with no upper bound on the number of recursively processed streams.
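A sketch of the recursive procedure follows. For simplicity it bounds the recursion by an explicit `depth` parameter, which the slides do not do; a full implementation would recurse until the streams die out:

```python
def streamlined(flips, depth=2):
    """Streamlined extraction: pairs HT/TH go straight to the output,
    while stream Y (HH -> H, TT -> T) and stream Z (same pair -> H,
    mixed pair -> T) are processed recursively up to the given depth."""
    if depth < 0 or len(flips) < 2:
        return []
    out, y, z = [], [], []
    for a, b in zip(flips[::2], flips[1::2]):
        if a != b:
            out.append(a)   # HT -> H, TH -> T
            z.append("T")   # mixed pair
        else:
            y.append(a)     # HH -> H, TT -> T
            z.append("H")   # same pair
    return out + streamlined(y, depth - 1) + streamlined(z, depth - 1)

print(streamlined(list("HTHHTTTH")))  # → ['H', 'T', 'H', 'T', 'H']
```

Note how the input HTHHTTTH yields five output bits, whereas the plain pairing scheme of the previous slides would yield only two: the recursion recovers randomness from the discarded HH/TT pairs.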
36 Explaining the algorithm. Let us look at the occurrences of tails in the three streams, to argue why they are independent. Tails can be obtained in the following three ways:
1. tails in the output come from input pairs of the form TH;
2. tails in Y come from input pairs of the form TT;
3. tails in Z come from input pairs of the form HT or TH.
The pairs TT are independent of the pairs TH and HT, as they are produced by different coin tosses, with no overlap. An occurrence of tails in the output is independent of those in Z: given tails in Z, it could be produced by either HT, which adds H to the output, or TH, which adds T to the output.
37 Quality. The extractor we described is optimal: it extracts all the randomness that is there. (See the lecture notes for details.)
38 Homework. Your friend flips a fair coin repeatedly until the first heads occurs. Let X be a random variable equal to the number of flips. You want to determine how many flips were performed in a specific experiment. You are allowed to ask a series of yes/no questions of the following form: you give your friend a set of integers, and your friend answers yes if the number of flips is in the set and no otherwise.
39 Questions with hints.
1. Give a formula for H(X). Hint: this is a specific random variable; apply the definition of entropy.
2. Describe a strategy such that the expected number of questions you ask before determining the number of flips is H(X). Hint: find a strategy with a formula for the expected number of questions that looks the same as the formula for H(X).
3. Give an intuitive explanation of why there is no strategy that would allow you to ask fewer than H(X) questions on average. Hint: just verbalize your intuitions, referring to entropy.
More informationCSE525: Randomized Algorithms and Probabilistic Analysis April 2, Lecture 1
CSE525: Randomized Algorithms and Probabilistic Analysis April 2, 2013 Lecture 1 Lecturer: Anna Karlin Scribe: Sonya Alexandrova and Eric Lei 1 Introduction The main theme of this class is randomized algorithms.
More informationClassification & Information Theory Lecture #8
Classification & Information Theory Lecture #8 Introduction to Natural Language Processing CMPSCI 585, Fall 2007 University of Massachusetts Amherst Andrew McCallum Today s Main Points Automatically categorizing
More information1 Ex. 1 Verify that the function H(p 1,..., p n ) = k p k log 2 p k satisfies all 8 axioms on H.
Problem sheet Ex. Verify that the function H(p,..., p n ) = k p k log p k satisfies all 8 axioms on H. Ex. (Not to be handed in). looking at the notes). List as many of the 8 axioms as you can, (without
More informationLecture 7: DecisionTrees
Lecture 7: DecisionTrees What are decision trees? Brief interlude on information theory Decision tree construction Overfitting avoidance Regression trees COMP-652, Lecture 7 - September 28, 2009 1 Recall:
More informationProbabilistic models
Probabilistic models Kolmogorov (Andrei Nikolaevich, 1903 1987) put forward an axiomatic system for probability theory. Foundations of the Calculus of Probabilities, published in 1933, immediately became
More information1 Review of The Learning Setting
COS 5: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #8 Scribe: Changyan Wang February 28, 208 Review of The Learning Setting Last class, we moved beyond the PAC model: in the PAC model we
More information02 Background Minimum background on probability. Random process
0 Background 0.03 Minimum background on probability Random processes Probability Conditional probability Bayes theorem Random variables Sampling and estimation Variance, covariance and correlation Probability
More informationProbability theory basics
Probability theory basics Michael Franke Basics of probability theory: axiomatic definition, interpretation, joint distributions, marginalization, conditional probability & Bayes rule. Random variables:
More informationMutually Exclusive Events
172 CHAPTER 3 PROBABILITY TOPICS c. QS, 7D, 6D, KS Mutually Exclusive Events A and B are mutually exclusive events if they cannot occur at the same time. This means that A and B do not share any outcomes
More informationUNIT NUMBER PROBABILITY 6 (Statistics for the binomial distribution) A.J.Hobson
JUST THE MATHS UNIT NUMBER 19.6 PROBABILITY 6 (Statistics for the binomial distribution) by A.J.Hobson 19.6.1 Construction of histograms 19.6.2 Mean and standard deviation of a binomial distribution 19.6.3
More informationn N CHAPTER 1 Atoms Thermodynamics Molecules Statistical Thermodynamics (S.T.)
CHAPTER 1 Atoms Thermodynamics Molecules Statistical Thermodynamics (S.T.) S.T. is the key to understanding driving forces. e.g., determines if a process proceeds spontaneously. Let s start with entropy
More informationP [(E and F )] P [F ]
CONDITIONAL PROBABILITY AND INDEPENDENCE WORKSHEET MTH 1210 This worksheet supplements our textbook material on the concepts of conditional probability and independence. The exercises at the end of each
More informationChapter 1: Introduction to Probability Theory
ECE5: Stochastic Signals and Systems Fall 8 Lecture - September 6, 8 Prof. Salim El Rouayheb Scribe: Peiwen Tian, Lu Liu, Ghadir Ayache Chapter : Introduction to Probability Theory Axioms of Probability
More informationRecursive Estimation
Recursive Estimation Raffaello D Andrea Spring 08 Problem Set : Bayes Theorem and Bayesian Tracking Last updated: March, 08 Notes: Notation: Unless otherwise noted, x, y, and z denote random variables,
More informationA Brief Review of Probability, Bayesian Statistics, and Information Theory
A Brief Review of Probability, Bayesian Statistics, and Information Theory Brendan Frey Electrical and Computer Engineering University of Toronto frey@psi.toronto.edu http://www.psi.toronto.edu A system
More informationLecture Notes 1 Basic Probability. Elements of Probability. Conditional probability. Sequential Calculation of Probability
Lecture Notes 1 Basic Probability Set Theory Elements of Probability Conditional probability Sequential Calculation of Probability Total Probability and Bayes Rule Independence Counting EE 178/278A: Basic
More informationPERFECT SECRECY AND ADVERSARIAL INDISTINGUISHABILITY
PERFECT SECRECY AND ADVERSARIAL INDISTINGUISHABILITY BURTON ROSENBERG UNIVERSITY OF MIAMI Contents 1. Perfect Secrecy 1 1.1. A Perfectly Secret Cipher 2 1.2. Odds Ratio and Bias 3 1.3. Conditions for Perfect
More informationEntropy as a measure of surprise
Entropy as a measure of surprise Lecture 5: Sam Roweis September 26, 25 What does information do? It removes uncertainty. Information Conveyed = Uncertainty Removed = Surprise Yielded. How should we quantify
More informationSome Basic Concepts of Probability and Information Theory: Pt. 2
Some Basic Concepts of Probability and Information Theory: Pt. 2 PHYS 476Q - Southern Illinois University January 22, 2018 PHYS 476Q - Southern Illinois University Some Basic Concepts of Probability and
More informationDiscrete Random Variables
Chapter 5 Discrete Random Variables Suppose that an experiment and a sample space are given. A random variable is a real-valued function of the outcome of the experiment. In other words, the random variable
More informationProbability Experiments, Trials, Outcomes, Sample Spaces Example 1 Example 2
Probability Probability is the study of uncertain events or outcomes. Games of chance that involve rolling dice or dealing cards are one obvious area of application. However, probability models underlie
More informationLecture Lecture 5
Lecture 4 --- Lecture 5 A. Basic Concepts (4.1-4.2) 1. Experiment: A process of observing a phenomenon that has variation in its outcome. Examples: (E1). Rolling a die, (E2). Drawing a card form a shuffled
More informationSource Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria
Source Coding Master Universitario en Ingeniería de Telecomunicación I. Santamaría Universidad de Cantabria Contents Introduction Asymptotic Equipartition Property Optimal Codes (Huffman Coding) Universal
More informationDiscrete Random Variables
CPSC 53 Systems Modeling and Simulation Discrete Random Variables Dr. Anirban Mahanti Department of Computer Science University of Calgary mahanti@cpsc.ucalgary.ca Random Variables A random variable is
More informationLecture 9: Conditional Probability and Independence
EE5110: Probability Foundations July-November 2015 Lecture 9: Conditional Probability and Independence Lecturer: Dr. Krishna Jagannathan Scribe: Vishakh Hegde 9.1 Conditional Probability Definition 9.1
More informationExpected Value 7/7/2006
Expected Value 7/7/2006 Definition Let X be a numerically-valued discrete random variable with sample space Ω and distribution function m(x). The expected value E(X) is defined by E(X) = x Ω x m(x), provided
More informationAnnouncements. Lecture 5: Probability. Dangling threads from last week: Mean vs. median. Dangling threads from last week: Sampling bias
Recap Announcements Lecture 5: Statistics 101 Mine Çetinkaya-Rundel September 13, 2011 HW1 due TA hours Thursday - Sunday 4pm - 9pm at Old Chem 211A If you added the class last week please make sure to
More information6.042/18.062J Mathematics for Computer Science November 28, 2006 Tom Leighton and Ronitt Rubinfeld. Random Variables
6.042/18.062J Mathematics for Computer Science November 28, 2006 Tom Leighton and Ronitt Rubinfeld Lecture Notes Random Variables We ve used probablity to model a variety of experiments, games, and tests.
More informationLECTURE 1. 1 Introduction. 1.1 Sample spaces and events
LECTURE 1 1 Introduction The first part of our adventure is a highly selective review of probability theory, focusing especially on things that are most useful in statistics. 1.1 Sample spaces and events
More information1. Discrete Distributions
Virtual Laboratories > 2. Distributions > 1 2 3 4 5 6 7 8 1. Discrete Distributions Basic Theory As usual, we start with a random experiment with probability measure P on an underlying sample space Ω.
More informationMidterm Exam 1 Solution
EECS 126 Probability and Random Processes University of California, Berkeley: Fall 2015 Kannan Ramchandran September 22, 2015 Midterm Exam 1 Solution Last name First name SID Name of student on your left:
More informationMAT 271E Probability and Statistics
MAT 71E Probability and Statistics Spring 013 Instructor : Class Meets : Office Hours : Textbook : Supp. Text : İlker Bayram EEB 1103 ibayram@itu.edu.tr 13.30 1.30, Wednesday EEB 5303 10.00 1.00, Wednesday
More informationBandits, Experts, and Games
Bandits, Experts, and Games CMSC 858G Fall 2016 University of Maryland Intro to Probability* Alex Slivkins Microsoft Research NYC * Many of the slides adopted from Ron Jin and Mohammad Hajiaghayi Outline
More information2!3! (9)( 8x3 ) = 960x 3 ( 720x 3 ) = 1680x 3
CSCI 2200 Foundations of Computer Science Spring 2018 Quiz 2 (April 11, 2018) SOLUTIONS 1. [6 POINTS] An NBA team has nine players show up for a game, but only five players can be on the floor to play
More information