COMP2610/COMP Information Theory
|
|
- Roderick Carson
- 5 years ago
- Views:
Transcription
1 COMP2610/COMP Information Theory Lecture 9: Probabilistic Inequalities Mark Reid and Aditya Menon Research School of Computer Science The Australian National University August 19th, 2014 Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 1 / 37
2 Last time Mutual information chain rule Jensen s inequality Information cannot hurt Data processing inequality Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 2 / 37
3 Review: Data-Processing Inequality Theorem if X Y Z then: I (X ; Y ) I (X ; Z) X is the state of the world, Y is the data gathered and Z is the processed data No clever manipulation of the data can improve the inferences that can be made from the data No processing of Y, deterministic or random, can increase the information that Y contains about X Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 3 / 37
4 This time Markov s inequality Chebyshev s inequality Law of large numbers Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 4 / 37
5 Outline 1 Properties of expectation and variance 2 Markov s inequality 3 Chebyshev s inequality 4 Law of large numbers 5 Wrapping Up Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 5 / 37
6 1 Properties of expectation and variance 2 Markov s inequality 3 Chebyshev s inequality 4 Law of large numbers 5 Wrapping Up Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 6 / 37
7 Expectation and Variance Let X be a random variable over X, with probability distribution p Expected value: E[X ] = x X x p(x). Variance: V[X ] = E[(X E[X ]) 2 ] = E[X 2 ] (E[X ]) 2. Standard deviation is V[X ] Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 7 / 37
8 Properties of expectation A key property of expectations is linearity: [ n ] n E X i = E [X i ]. i=1 i=1 This holds even if the variables are dependent! We have for any a R, E[aX ] = a E[X ]. Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 8 / 37
9 Properties of variance We have linearity of variance for independent random variables: [ n ] n V X i = V [X i ]. i=1 i=1 Does not hold if the variables are dependent We have for any a R, V[aX ] = a 2 V[X ]. Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 9 / 37
10 1 Properties of expectation and variance 2 Markov s inequality 3 Chebyshev s inequality 4 Law of large numbers 5 Wrapping Up Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 10 / 37
11 Markov s Inequality Motivation 1000 school students sit an examination The busy principal is only told that the average score is 40 out of 100 The principal wants to estimate the maximum number of students who scored more than 80 Would it make sense to ask about the minimum number of students? Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 11 / 37
12 Markov s Inequality Motivation Call x the number of students who score more than 80 We know: = 80x + S, where S is the total score of students scoring less than 80 Exam scores are nonnegative, so certainly S 0 Thus, 80x , or x 500. Can we formalise this more generally? Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 12 / 37
13 Markov s Inequality Theorem Let X be a nonnegative random variable. Then, for any λ > 0, p(x λ) E[X ] λ. Bounds probability of observing a large outcome Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 13 / 37
14 Markov s Inequality Alternate Statement Corollary Let X be a nonnegative random variable. Then, for any λ > 0, p(x λ E[X ]) 1 λ. Observations of nonnegative random variable unlikely to be much larger than expected value Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 14 / 37
15 Markov s Inequality Proof E[X ] = x p(x) x X = x p(x) + x p(x) x<λ x λ x λ x p(x) nonneg. of random variable x λ λ p(x) = λ p(x λ). Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 15 / 37
16 Markov s Inequality Illustration from Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 16 / 37
17 Markov s Inequality Illustration from Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 17 / 37
18 Markov s Inequality Illustration from Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 18 / 37
19 Markov s Inequality Illustration from Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 19 / 37
20 1 Properties of expectation and variance 2 Markov s inequality 3 Chebyshev s inequality 4 Law of large numbers 5 Wrapping Up Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 20 / 37
21 Chebyshev s Inequality Motivation Markov s inequality only uses the mean of the distribution What about the spread of the distribution (variance)? Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 21 / 37
22 Chebyshev s Inequality Theorem Let X be a random variable with E[X ] <. Then, for any λ > 0, p( X E[X ] λ) V[X ] λ 2. Bounds the probability of observing an unexpected outcome Do not require non negativity Two-sided bound Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 22 / 37
23 Chebyshev s Inequality Alternate Statement Corollary Let X be a random variable with E[X ] <. Then, for any λ > 0, p( X E[X ] λ V[X ]) 1 λ 2. Observations are unlikely to occur several standard deviations away from the mean Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 23 / 37
24 Chebyshev s Inequality Proof Define Y = (X E[X ]) 2. Then, by Markov s inequality, for any ν > 0, But, Also, Thus, setting λ = ν, p(y ν) E[Y ] ν. E[Y ] = V[X ]. Y ν X E[X ] ν. p( X E[X ] λ) V[X ] λ 2. Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 24 / 37
25 Chebyshev s Inequality Illustration For a binomial with N trials and success probability θ, we have e.g. p( X Nθ 2Nθ(1 θ)) 1 2. Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 25 / 37
26 Chebyshev s Inequality Example Suppose we have a coin with bias θ, i.e. p(x = 1) = θ Say we flip the coin n times, and observe x 1,..., x n {0, 1} We use the maximum likelihood estimator of θ: ˆθ n = x x n n Estimate how large n should be such that 1% probability of a 5% error p( ˆθ n θ 0.05) 0.01? Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 26 / 37
27 Chebyshev s Inequality Example Observe that E[ˆθ n ] = V[ˆθ n ] = n i=1 E[x i] = θ n n i=1 V[x i] θ(1 θ) n 2 =. n Thus, applying Chebyshev s inequality to ˆθ n, p( ˆθ n θ > 0.05) We are guaranteed this is less than 0.01 if When θ = 0.5, n 10, 000! n θ(1 θ) (0.05) 2 (0.01). θ(1 θ) (0.05) 2 n. Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 27 / 37
28 1 Properties of expectation and variance 2 Markov s inequality 3 Chebyshev s inequality 4 Law of large numbers 5 Wrapping Up Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 28 / 37
29 Independent and Identically Distributed Let X 1,..., X n be random variables such that: Each X i is independent of X j The distribution of X i is the same as that of X j Then, we say that X 1,..., X n are independent and identically distributed (or iid) Example: For n independent flips of an unbiased coin, X 1,..., X n are iid from Bernoulli ( ) 1 2 Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 29 / 37
30 Law of Large Numbers Theorem Let X 1,..., X n be a sequence of iid random variables, with and V[X i ] <. Define E[X i ] = µ Then, for any ɛ > 0, X n = X X n. n lim p( X n µ > ɛ) = 0. n Given enough trials, the empirical success frequency will be close to the expected value Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 30 / 37
31 Law of Large Numbers Proof Since X i s are identically distributed, E[ X n ] = µ. Since the X i s are independent, [ ] X X n V[ X n ] = V n = V [X X n ] n 2 = nσ2 n 2 = σ2 n. Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 31 / 37
32 Law of Large Numbers Proof Applying Chebyshev s inequality to X n, p( X n µ ɛ) V[ X n ] ɛ 2 = σ2 nɛ 2. As n, the right hand side 0. Thus, p( X n µ < ɛ) 1. Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 32 / 37
33 Law of Large Numbers Illustration N = 1000 trials with Bernoulli random variable with parameter Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 33 / 37
34 Law of Large Numbers Illustration N = trials with Bernoulli random variable with parameter x 10 Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 34 / 37
35 1 Properties of expectation and variance 2 Markov s inequality 3 Chebyshev s inequality 4 Law of large numbers 5 Wrapping Up Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 35 / 37
36 Summary & Conclusions Markov s inequality Chebyshev s inequality Law of large numbers Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 36 / 37
37 Next time Ensembles and sequences Typical sets Approximation Equipartition (AEP) Mark Reid and Aditya Menon (ANU) COMP2610/COMP Information Theory Semester 2 37 / 37
MAS113 Introduction to Probability and Statistics
MAS113 Introduction to Probability and Statistics School of Mathematics and Statistics, University of Sheffield 2018 19 Identically distributed Suppose we have n random variables X 1, X 2,..., X n. Identically
More informationKousha Etessami. U. of Edinburgh, UK. Kousha Etessami (U. of Edinburgh, UK) Discrete Mathematics (Chapter 7) 1 / 13
Discrete Mathematics & Mathematical Reasoning Chapter 7 (continued): Markov and Chebyshev s Inequalities; and Examples in probability: the birthday problem Kousha Etessami U. of Edinburgh, UK Kousha Etessami
More informationReview. December 4 th, Review
December 4 th, 2017 Att. Final exam: Course evaluation Friday, 12/14/2018, 10:30am 12:30pm Gore Hall 115 Overview Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 6: Statistics and Sampling Distributions Chapter
More informationMathematical statistics
October 18 th, 2018 Lecture 16: Midterm review Countdown to mid-term exam: 7 days Week 1 Chapter 1: Probability review Week 2 Week 4 Week 7 Chapter 6: Statistics Chapter 7: Point Estimation Chapter 8:
More informationMidterm #1. Lecture 10: Joint Distributions and the Law of Large Numbers. Joint Distributions - Example, cont. Joint Distributions - Example
Midterm #1 Midterm 1 Lecture 10: and the Law of Large Numbers Statistics 104 Colin Rundel February 0, 01 Exam will be passed back at the end of class Exam was hard, on the whole the class did well: Mean:
More informationLecture 13 (Part 2): Deviation from mean: Markov s inequality, variance and its properties, Chebyshev s inequality
Lecture 13 (Part 2): Deviation from mean: Markov s inequality, variance and its properties, Chebyshev s inequality Discrete Structures II (Summer 2018) Rutgers University Instructor: Abhishek Bhrushundi
More informationPractice Problem - Skewness of Bernoulli Random Variable. Lecture 7: Joint Distributions and the Law of Large Numbers. Joint Distributions - Example
A little more E(X Practice Problem - Skewness of Bernoulli Random Variable Lecture 7: and the Law of Large Numbers Sta30/Mth30 Colin Rundel February 7, 014 Let X Bern(p We have shown that E(X = p Var(X
More informationQuick Tour of Basic Probability Theory and Linear Algebra
Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions
More informationMathematical statistics
October 4 th, 2018 Lecture 12: Information Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter
More informationCOMPSCI 240: Reasoning Under Uncertainty
COMPSCI 240: Reasoning Under Uncertainty Andrew Lan and Nic Herndon University of Massachusetts at Amherst Spring 2019 Lecture 20: Central limit theorem & The strong law of large numbers Markov and Chebyshev
More informationX = X X n, + X 2
CS 70 Discrete Mathematics for CS Fall 2003 Wagner Lecture 22 Variance Question: At each time step, I flip a fair coin. If it comes up Heads, I walk one step to the right; if it comes up Tails, I walk
More informationQuantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing
Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October
More informationLecture 2 Sep 5, 2017
CS 388R: Randomized Algorithms Fall 2017 Lecture 2 Sep 5, 2017 Prof. Eric Price Scribe: V. Orestis Papadigenopoulos and Patrick Rall NOTE: THESE NOTES HAVE NOT BEEN EDITED OR CHECKED FOR CORRECTNESS 1
More informationLecture 8. October 22, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.
Lecture 8 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University October 22, 2007 1 2 3 4 5 6 1 Define convergent series 2 Define the Law of Large Numbers
More informationIntroduction to Probability
LECTURE NOTES Course 6.041-6.431 M.I.T. FALL 2000 Introduction to Probability Dimitri P. Bertsekas and John N. Tsitsiklis Professors of Electrical Engineering and Computer Science Massachusetts Institute
More informationExample continued. Math 425 Intro to Probability Lecture 37. Example continued. Example
continued : Coin tossing Math 425 Intro to Probability Lecture 37 Kenneth Harris kaharri@umich.edu Department of Mathematics University of Michigan April 8, 2009 Consider a Bernoulli trials process with
More informationProbability and Statistics
Probability and Statistics Jane Bae Stanford University hjbae@stanford.edu September 16, 2014 Jane Bae (Stanford) Probability and Statistics September 16, 2014 1 / 35 Overview 1 Probability Concepts Probability
More informationLecture Notes 5 Convergence and Limit Theorems. Convergence with Probability 1. Convergence in Mean Square. Convergence in Probability, WLLN
Lecture Notes 5 Convergence and Limit Theorems Motivation Convergence with Probability Convergence in Mean Square Convergence in Probability, WLLN Convergence in Distribution, CLT EE 278: Convergence and
More informationCSE 312 Final Review: Section AA
CSE 312 TAs December 8, 2011 General Information General Information Comprehensive Midterm General Information Comprehensive Midterm Heavily weighted toward material after the midterm Pre-Midterm Material
More informationExpectation. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda
Expectation DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean, variance,
More informationDiscrete Probability Refresher
ECE 1502 Information Theory Discrete Probability Refresher F. R. Kschischang Dept. of Electrical and Computer Engineering University of Toronto January 13, 1999 revised January 11, 2006 Probability theory
More informationLecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019
Lecture 10: Probability distributions DANIEL WELLER TUESDAY, FEBRUARY 19, 2019 Agenda What is probability? (again) Describing probabilities (distributions) Understanding probabilities (expectation) Partial
More informationLecture 4: Sampling, Tail Inequalities
Lecture 4: Sampling, Tail Inequalities Variance and Covariance Moment and Deviation Concentration and Tail Inequalities Sampling and Estimation c Hung Q. Ngo (SUNY at Buffalo) CSE 694 A Fun Course 1 /
More informationExpectation of Random Variables
1 / 19 Expectation of Random Variables Saravanan Vijayakumaran sarva@ee.iitb.ac.in Department of Electrical Engineering Indian Institute of Technology Bombay February 13, 2015 2 / 19 Expectation of Discrete
More informationMS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari
MS&E 226: Small Data Lecture 11: Maximum likelihood (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 18 The likelihood function 2 / 18 Estimating the parameter This lecture develops the methodology behind
More informationExpectation. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda
Expectation DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean,
More informationDiscrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 20
CS 70 Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 20 Today we shall discuss a measure of how close a random variable tends to be to its expectation. But first we need to see how to compute
More informationMathematics Ph.D. Qualifying Examination Stat Probability, January 2018
Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from
More informationLecture 11. Probability Theory: an Overveiw
Math 408 - Mathematical Statistics Lecture 11. Probability Theory: an Overveiw February 11, 2013 Konstantin Zuev (USC) Math 408, Lecture 11 February 11, 2013 1 / 24 The starting point in developing the
More informationCSE 312: Foundations of Computing II Quiz Section #10: Review Questions for Final Exam (solutions)
CSE 312: Foundations of Computing II Quiz Section #10: Review Questions for Final Exam (solutions) 1. (Confidence Intervals, CLT) Let X 1,..., X n be iid with unknown mean θ and known variance σ 2. Assume
More informationEconometrics I, Estimation
Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the
More information6 The normal distribution, the central limit theorem and random samples
6 The normal distribution, the central limit theorem and random samples 6.1 The normal distribution We mentioned the normal (or Gaussian) distribution in Chapter 4. It has density f X (x) = 1 σ 1 2π e
More informationFundamental Tools - Probability Theory IV
Fundamental Tools - Probability Theory IV MSc Financial Mathematics The University of Warwick October 1, 2015 MSc Financial Mathematics Fundamental Tools - Probability Theory IV 1 / 14 Model-independent
More informationExpectation is linear. So far we saw that E(X + Y ) = E(X) + E(Y ). Let α R. Then,
Expectation is linear So far we saw that E(X + Y ) = E(X) + E(Y ). Let α R. Then, E(αX) = ω = ω (αx)(ω) Pr(ω) αx(ω) Pr(ω) = α ω X(ω) Pr(ω) = αe(x). Corollary. For α, β R, E(αX + βy ) = αe(x) + βe(y ).
More informationLecture 18: Central Limit Theorem. Lisa Yan August 6, 2018
Lecture 18: Central Limit Theorem Lisa Yan August 6, 2018 Announcements PS5 due today Pain poll PS6 out today Due next Monday 8/13 (1:30pm) (will not be accepted after Wed 8/15) Programming part: Java,
More informationLecture Notes 3 Convergence (Chapter 5)
Lecture Notes 3 Convergence (Chapter 5) 1 Convergence of Random Variables Let X 1, X 2,... be a sequence of random variables and let X be another random variable. Let F n denote the cdf of X n and let
More informationDS-GA 1002 Lecture notes 11 Fall Bayesian statistics
DS-GA 100 Lecture notes 11 Fall 016 Bayesian statistics In the frequentist paradigm we model the data as realizations from a distribution that depends on deterministic parameters. In contrast, in Bayesian
More informationExpectation of geometric distribution
Expectation of geometric distribution What is the probability that X is finite? Can now compute E(X): Σ k=1f X (k) = Σ k=1(1 p) k 1 p = pσ j=0(1 p) j = p 1 1 (1 p) = 1 E(X) = Σ k=1k (1 p) k 1 p = p [ Σ
More informationDiscrete Probability distribution Discrete Probability distribution
438//9.4.. Discrete Probability distribution.4.. Binomial P.D. The outcomes belong to either of two relevant categories. A binomial experiment requirements: o There is a fixed number of trials (n). o On
More informationBayesian Inference and MCMC
Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the
More informationFundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner
Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization
More informationNotation: X = random variable; x = particular value; P(X = x) denotes probability that X equals the value x.
Ch. 16 Random Variables Def n: A random variable is a numerical measurement of the outcome of a random phenomenon. A discrete random variable is a random variable that assumes separate values. # of people
More informationTwelfth Problem Assignment
EECS 401 Not Graded PROBLEM 1 Let X 1, X 2,... be a sequence of independent random variables that are uniformly distributed between 0 and 1. Consider a sequence defined by (a) Y n = max(x 1, X 2,..., X
More informationan introduction to bayesian inference
with an application to network analysis http://jakehofman.com january 13, 2010 motivation would like models that: provide predictive and explanatory power are complex enough to describe observed phenomena
More informationLecture 4: September Reminder: convergence of sequences
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 4: September 6 In this lecture we discuss the convergence of random variables. At a high-level, our first few lectures focused
More informationSTAT 285: Fall Semester Final Examination Solutions
Name: Student Number: STAT 285: Fall Semester 2014 Final Examination Solutions 5 December 2014 Instructor: Richard Lockhart Instructions: This is an open book test. As such you may use formulas such as
More informationBias Variance Trade-off
Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]
More information(Practice Version) Midterm Exam 2
EECS 126 Probability and Random Processes University of California, Berkeley: Fall 2014 Kannan Ramchandran November 7, 2014 (Practice Version) Midterm Exam 2 Last name First name SID Rules. DO NOT open
More informationEE514A Information Theory I Fall 2013
EE514A Information Theory I Fall 2013 K. Mohan, Prof. J. Bilmes University of Washington, Seattle Department of Electrical Engineering Fall Quarter, 2013 http://j.ee.washington.edu/~bilmes/classes/ee514a_fall_2013/
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationLecture 7: Chapter 7. Sums of Random Variables and Long-Term Averages
Lecture 7: Chapter 7. Sums of Random Variables and Long-Term Averages ELEC206 Probability and Random Processes, Fall 2014 Gil-Jin Jang gjang@knu.ac.kr School of EE, KNU page 1 / 15 Chapter 7. Sums of Random
More informationName: Matriculation Number: Tutorial Group: A B C D E
Name: Matriculation Number: Tutorial Group: A B C D E Question: 1 (5 Points) 2 (6 Points) 3 (5 Points) 4 (5 Points) Total (21 points) Score: General instructions: The written test contains 4 questions
More informationStatistics II Lesson 1. Inference on one population. Year 2009/10
Statistics II Lesson 1. Inference on one population Year 2009/10 Lesson 1. Inference on one population Contents Introduction to inference Point estimators The estimation of the mean and variance Estimating
More informationConfidence Intervals and Hypothesis Tests
Confidence Intervals and Hypothesis Tests STA 281 Fall 2011 1 Background The central limit theorem provides a very powerful tool for determining the distribution of sample means for large sample sizes.
More informationU Logo Use Guidelines
Information Theory Lecture 3: Applications to Machine Learning U Logo Use Guidelines Mark Reid logo is a contemporary n of our heritage. presents our name, d and our motto: arn the nature of things. authenticity
More informationLimit Theorems. STATISTICS Lecture no Department of Econometrics FEM UO Brno office 69a, tel
STATISTICS Lecture no. 6 Department of Econometrics FEM UO Brno office 69a, tel. 973 442029 email:jiri.neubauer@unob.cz 3. 11. 2009 If we repeat some experiment independently we can create using given
More informationProbability Models. 4. What is the definition of the expectation of a discrete random variable?
1 Probability Models The list of questions below is provided in order to help you to prepare for the test and exam. It reflects only the theoretical part of the course. You should expect the questions
More information5.2 Fisher information and the Cramer-Rao bound
Stat 200: Introduction to Statistical Inference Autumn 208/9 Lecture 5: Maximum likelihood theory Lecturer: Art B. Owen October 9 Disclaimer: These notes have not been subjected to the usual scrutiny reserved
More informationExpectation of geometric distribution. Variance and Standard Deviation. Variance: Examples
Expectation of geometric distribution Variance and Standard Deviation What is the probability that X is finite? Can now compute E(X): Σ k=f X (k) = Σ k=( p) k p = pσ j=0( p) j = p ( p) = E(X) = Σ k=k (
More informationMathematical Statistics 1 Math A 6330
Mathematical Statistics 1 Math A 6330 Chapter 3 Common Families of Distributions Mohamed I. Riffi Department of Mathematics Islamic University of Gaza September 28, 2015 Outline 1 Subjects of Lecture 04
More informationSome Asymptotic Bayesian Inference (background to Chapter 2 of Tanner s book)
Some Asymptotic Bayesian Inference (background to Chapter 2 of Tanner s book) Principal Topics The approach to certainty with increasing evidence. The approach to consensus for several agents, with increasing
More informationChapter 8. Some Approximations to Probability Distributions: Limit Theorems
Chapter 8. Some Approximations to Probability Distributions: Limit Theorems Sections 8.2 -- 8.3: Convergence in Probability and in Distribution Jiaping Wang Department of Mathematical Science 04/22/2013,
More informationElements of statistics (MATH0487-1)
Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium November 12, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -
More informationPoint Estimation. Vibhav Gogate The University of Texas at Dallas
Point Estimation Vibhav Gogate The University of Texas at Dallas Some slides courtesy of Carlos Guestrin, Chris Bishop, Dan Weld and Luke Zettlemoyer. Basics: Expectation and Variance Binary Variables
More informationMathematics Qualifying Examination January 2015 STAT Mathematical Statistics
Mathematics Qualifying Examination January 2015 STAT 52800 - Mathematical Statistics NOTE: Answer all questions completely and justify your derivations and steps. A calculator and statistical tables (normal,
More informationTopic 7: Convergence of Random Variables
Topic 7: Convergence of Ranom Variables Course 003, 2016 Page 0 The Inference Problem So far, our starting point has been a given probability space (S, F, P). We now look at how to generate information
More informationIE 230 Probability & Statistics in Engineering I. Closed book and notes. 120 minutes.
Closed book and notes. 10 minutes. Two summary tables from the concise notes are attached: Discrete distributions and continuous distributions. Eight Pages. Score _ Final Exam, Fall 1999 Cover Sheet, Page
More information3. Review of Probability and Statistics
3. Review of Probability and Statistics ECE 830, Spring 2014 Probabilistic models will be used throughout the course to represent noise, errors, and uncertainty in signal processing problems. This lecture
More informationU Logo Use Guidelines
COMP2610/6261 - Information Theory Lecture 15: Arithmetic Coding U Logo Use Guidelines Mark Reid and Aditya Menon logo is a contemporary n of our heritage. presents our name, d and our motto: arn the nature
More informationProbability Review. Yutian Li. January 18, Stanford University. Yutian Li (Stanford University) Probability Review January 18, / 27
Probability Review Yutian Li Stanford University January 18, 2018 Yutian Li (Stanford University) Probability Review January 18, 2018 1 / 27 Outline 1 Elements of probability 2 Random variables 3 Multiple
More informationE X A M. Probability Theory and Stochastic Processes Date: December 13, 2016 Duration: 4 hours. Number of pages incl.
E X A M Course code: Course name: Number of pages incl. front page: 6 MA430-G Probability Theory and Stochastic Processes Date: December 13, 2016 Duration: 4 hours Resources allowed: Notes: Pocket calculator,
More informationSUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416)
SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) D. ARAPURA This is a summary of the essential material covered so far. The final will be cumulative. I ve also included some review problems
More informationLecture 13 Fundamentals of Bayesian Inference
Lecture 13 Fundamentals of Bayesian Inference Dennis Sun Stats 253 August 11, 2014 Outline of Lecture 1 Bayesian Models 2 Modeling Correlations Using Bayes 3 The Universal Algorithm 4 BUGS 5 Wrapping Up
More informationParametric Models: from data to models
Parametric Models: from data to models Pradeep Ravikumar Co-instructor: Manuela Veloso Machine Learning 10-701 Jan 22, 2018 Recall: Model-based ML DATA MODEL LEARNING MODEL MODEL INFERENCE KNOWLEDGE Learning:
More informationLecture 2: Review of Basic Probability Theory
ECE 830 Fall 2010 Statistical Signal Processing instructor: R. Nowak, scribe: R. Nowak Lecture 2: Review of Basic Probability Theory Probabilistic models will be used throughout the course to represent
More informationMAS223 Statistical Inference and Modelling Exercises
MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,
More informationIntroduction to Simple Linear Regression
Introduction to Simple Linear Regression Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Introduction to Simple Linear Regression 1 / 68 About me Faculty in the Department
More informationTHEORY OF PROBABILITY VLADIMIR KOBZAR
THEORY OF PROBABILITY VLADIMIR KOBZAR Lecture 20 - Conditional Expectation, Inequalities, Laws of Large Numbers, Central Limit Theorem This lecture is based on the materials from the Courant Institute
More informationReview of Discrete Probability (contd.)
Stat 504, Lecture 2 1 Review of Discrete Probability (contd.) Overview of probability and inference Probability Data generating process Observed data Inference The basic problem we study in probability:
More informationReview of Probabilities and Basic Statistics
Alex Smola Barnabas Poczos TA: Ina Fiterau 4 th year PhD student MLD Review of Probabilities and Basic Statistics 10-701 Recitations 1/25/2013 Recitation 1: Statistics Intro 1 Overview Introduction to
More informationBernoulli and Binomial Distributions. Notes. Bernoulli Trials. Bernoulli/Binomial Random Variables Bernoulli and Binomial Distributions.
Lecture 11 Text: A Course in Probability by Weiss 5.3 STAT 225 Introduction to Probability Models February 16, 2014 Whitney Huang Purdue University 11.1 Agenda 1 2 11.2 Bernoulli trials Many problems in
More informationProbability Theory for Machine Learning. Chris Cremer September 2015
Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares
More informationBasic Probability. Introduction
Basic Probability Introduction The world is an uncertain place. Making predictions about something as seemingly mundane as tomorrow s weather, for example, is actually quite a difficult task. Even with
More informationRegression #3: Properties of OLS Estimator
Regression #3: Properties of OLS Estimator Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #3 1 / 20 Introduction In this lecture, we establish some desirable properties associated with
More informationRandom Variable. Pr(X = a) = Pr(s)
Random Variable Definition A random variable X on a sample space Ω is a real-valued function on Ω; that is, X : Ω R. A discrete random variable is a random variable that takes on only a finite or countably
More information2. Variance and Covariance: We will now derive some classic properties of variance and covariance. Assume real-valued random variables X and Y.
CS450 Final Review Problems Fall 08 Solutions or worked answers provided Problems -6 are based on the midterm review Identical problems are marked recap] Please consult previous recitations and textbook
More informationLecture 3: Expected Value. These integrals are taken over all of Ω. If we wish to integrate over a measurable subset A Ω, we will write
Lecture 3: Expected Value 1.) Definitions. If X 0 is a random variable on (Ω, F, P), then we define its expected value to be EX = XdP. Notice that this quantity may be. For general X, we say that EX exists
More informationQuiz 1. Name: Instructions: Closed book, notes, and no electronic devices.
Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. 1. What is the difference between a deterministic model and a probabilistic model? (Two or three sentences only). 2. What is the
More informationRegression Estimation Least Squares and Maximum Likelihood
Regression Estimation Least Squares and Maximum Likelihood Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 3, Slide 1 Least Squares Max(min)imization Function to minimize
More informationLecture 22: Final Review
Lecture 22: Final Review Nuts and bolts Fundamental questions and limits Tools Practical algorithms Future topics Dr Yao Xie, ECE587, Information Theory, Duke University Basics Dr Yao Xie, ECE587, Information
More informationRecitation 2: Probability
Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions
More informationMethods of Estimation
Methods of Estimation MIT 18.655 Dr. Kempthorne Spring 2016 1 Outline Methods of Estimation I 1 Methods of Estimation I 2 X X, X P P = {P θ, θ Θ}. Problem: Finding a function θˆ(x ) which is close to θ.
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University August 30, 2017 Today: Decision trees Overfitting The Big Picture Coming soon Probabilistic learning MLE,
More informationRefresher on Discrete Probability
Refresher on Discrete Probability STAT 27725/CMSC 25400: Machine Learning Shubhendu Trivedi University of Chicago October 2015 Background Things you should have seen before Events, Event Spaces Probability
More informationCS 591, Lecture 2 Data Analytics: Theory and Applications Boston University
CS 591, Lecture 2 Data Analytics: Theory and Applications Boston University Charalampos E. Tsourakakis January 25rd, 2017 Probability Theory The theory of probability is a system for making better guesses.
More informationClosed book and notes. 120 minutes. Cover page, five pages of exam. No calculators.
IE 230 Seat # Closed book and notes. 120 minutes. Cover page, five pages of exam. No calculators. Score Final Exam, Spring 2005 (May 2) Schmeiser Closed book and notes. 120 minutes. Consider an experiment
More informationChapter 3: Random Variables 1
Chapter 3: Random Variables 1 Yunghsiang S. Han Graduate Institute of Communication Engineering, National Taipei University Taiwan E-mail: yshan@mail.ntpu.edu.tw 1 Modified from the lecture notes by Prof.
More informationStat Lecture 20. Last class we introduced the covariance and correlation between two jointly distributed random variables.
Stat 260 - Lecture 20 Recap of Last Class Last class we introduced the covariance and correlation between two jointly distributed random variables. Today: We will introduce the idea of a statistic and
More informationOverview. Confidence Intervals Sampling and Opinion Polls Error Correcting Codes Number of Pet Unicorns in Ireland
Overview Confidence Intervals Sampling and Opinion Polls Error Correcting Codes Number of Pet Unicorns in Ireland Confidence Intervals When a random variable lies in an interval a X b with a specified
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More information