Confidence Intervals
Matthew Casey
Quantitative Foundations, Project 3
Instructor: Linwei Wang

Contents

1 Introduction
  1.1 Warning
  1.2 Goals of Statistics
  1.3 Random Variables
  1.4 Distributions
  1.5 Expectations
  1.6 Variance
  1.7 Exercise
  1.8 More Information

2 The Central Limit Theorem
  2.1 Averaging Variables
  2.2 Entropy
  2.3 The Normal Distribution
  2.4 Central Limit Theorem
  2.5 Summary
  2.6 More Information

3 Confidence Intervals
  3.1 Definition
  3.2 Exact Intervals Using Hoeffding's Inequality
  3.3 Asymptotic Intervals Using the Normal Distribution
    3.3.1 Known σ
    3.3.2 Unknown σ
  3.4 More Information
1 Introduction

1.1 Warning

By necessity, we will need to use many concepts here, such as random variable, independent, or probability density, without defining them in a mathematically rigorous way. (That would take the entire time allotted for this section!) Feel free to ask questions, and we will try to convey the meanings of these concepts at a level needed for working knowledge, but in the end it is your responsibility to fill gaps in your own background as needed.

1.2 Goals of Statistics

Let's begin by talking about the basic goals of statistics, starting with a very simple example. Suppose we have a weighted 4-sided die, which has some probability of returning each number in 1, 2, ..., 4. Thus, we can think of that die as a simple probability distribution:

    p(x) = p_1  if x = 1
           p_2  if x = 2
           p_3  if x = 3
           p_4  if x = 4

So, if we knew the values of p_1, ..., p_4, we would know everything we care about the die. Now, in order for this to be a valid probability distribution, there are a couple of obvious conditions that must be satisfied. First, probabilities must be non-negative, that is, p_i >= 0 for all i. Second, when we roll the die we have to get some number. This means that p_1 + p_2 + p_3 + p_4 = 1. Any set of numbers p_1, ..., p_4 satisfying these two conditions constitutes a valid probability distribution.

Now, suppose that we roll the die a bunch of times, and we get the following result: 4, 1, 3, 2, 1, 2, 4, 2, 1, 1.

Example statistical problem 1: What is p? Can you think of a way to estimate it?

An obvious solution would be to make a histogram, with probability proportional to the number of times each outcome occurred. This yields the estimated distribution

    p̂(x) = 0.4  if x = 1
           0.3  if x = 2
           0.1  if x = 3
           0.2  if x = 4

This is a reasonable guess, but there are some obvious problems here. In this particular case, perhaps we just happened to get more results of x = 1 by chance. Obviously, we can't expect that the above probabilities are the true ones. For example, simulating another dataset of size 10, I get the data

    2, 1, 1, 2, 4, 4, 2, 4, 1, 3

along with the estimated distribution

    p̂(x) = 0.3  if x = 1
           0.3  if x = 2
           0.1  if x = 3
           0.3  if x = 4

If we really consider the situation, we can't make any rigorous guarantee on the difference between our estimated distribution and the true one. Maybe the dice rolls we got just happened to be highly unusual!

Let's consider another example. Suppose we are interested in the probability p(x) that a person has a given height. Now, notice a worrisome technical difficulty here. If we pick any particular exact height, it seems exceedingly unlikely that we will ever find a person with precisely that height. Rather, for continuous variables, we should formally speak of probability densities, not probability distributions. This means we are looking for a function p(x) such that

    Pr[a <= X <= b] = ∫_a^b p(x) dx.

That is, we get real probabilities by integrating a probability density. In particular, notice that it is possible for a probability density to be greater than one. (E.g., the density p(x) = a·I[0 <= x <= 1/a] can have an arbitrarily high value a.) In any case, probability densities obey similar rules to probability distributions, namely p(x) >= 0 for all x, and

    ∫_{-∞}^{+∞} p(x) dx = 1.

Anyway, suppose that we go out onto the street, get a set of 10 random people, and measure their heights in centimeters. We might get data like the following: 154, 192, 145, 101, 155, ...

Now, we might ask to recover the original probability density p(x). However, we might also be interested in only certain aspects of the distribution. For example, we might only care about the mean of p,

    µ = ∫_{-∞}^{+∞} x p(x) dx.

Example statistical problem 2: What is µ? Can you think of a way to estimate it?

We can compute the mean of the above dataset. But, of course, we want to know the true mean, which is presumably different. So, rather than simply reporting the sample mean, we should report some sort of guarantee of its reliability. It would be really nice if we could make a statement like the following:

    The true mean is in such-and-such a range.

The problem is, we can't do that! We could have gotten really unlucky in our dataset. In principle (knowing nothing about the real heights of humans on earth), the true mean height could be 50, and we just happened to be very unlucky and get unusually tall people when we collected our data. Thus, in statistics, we will have to resign ourselves to fundamentally weaker guarantees. Roughly speaking, we will make guarantees of the following type:

    Unless we were unlucky, the true mean is in such-and-such a range.

We will even go on to quantify exactly what "unlucky" means and how unlucky we would have to be. That is, we will ultimately make a guarantee like this:

    A 95% confidence interval for the true mean is such-and-such a range.

Now, notice: this does NOT mean that there is a 95% probability that the true mean is in the range. (If you remember one thing about statistics from this course, let it be this!) The true mean is a fixed number. We don't happen to know it, but it is out there in the world, and it is what it is.
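The histogram estimate from problem 1 is easy to sketch in code. Here is a minimal version in Python (used here purely for illustration; the function name `estimate_die` is our own):

```python
from collections import Counter

def estimate_die(rolls, sides=4):
    """Estimate p(x) as the fraction of rolls landing on each face."""
    counts = Counter(rolls)
    return {x: counts.get(x, 0) / len(rolls) for x in range(1, sides + 1)}

# The first dataset from the text: 10 rolls of the weighted 4-sided die.
rolls = [4, 1, 3, 2, 1, 2, 4, 2, 1, 1]
print(estimate_die(rolls))  # {1: 0.4, 2: 0.3, 3: 0.1, 4: 0.2}
```

Running it on the second dataset from the text reproduces the second estimated distribution; both estimates sum to 1, as any valid probability distribution must.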
Rather, what we are saying is this: we have a procedure for building these things we call confidence intervals. The guarantee we make is precisely this: if you go out into the world and collect data, and then build confidence intervals, then 95% of the time your confidence interval will contain the true mean. That's all the guarantee they make. It isn't really the guarantee we would like to make. It is awkward. In real life, you do one experiment, and you want to know what the mean is. A confidence interval doesn't tell you what you want to know. We compute confidence intervals because they are the thing we are able to compute, not because they are the thing we want to compute.

The rest of these notes will concentrate on background material to get your statistical brain muscles warmed up.

1.3 Random Variables

Very informally, a random variable is a number that comes from a random event.

Example: Flip a coin 7 times, and let X be the number of heads that come up.

Example: Gather data on the heights of 15 people, and let X be the mean of the measured heights.

You will come to appreciate the purpose of random variables in time.

1.4 Distributions

A variable has a uniform distribution if its probability density is given by, for some numbers a < b,

    p(x) = 1/(b - a)  if a <= x <= b
           0          otherwise.

A variable has a Bernoulli distribution if its probability distribution is given by, for some number θ in [0, 1],

    p(x) = θ      if x = 1
           1 - θ  if x = 0.

A variable has a Normal or Gaussian distribution if its density is given by, for some numbers µ and σ > 0,

    p(x) = (1 / (σ √(2π))) exp( -(x - µ)² / (2σ²) ).

The Normal is extremely important because (as you might imagine from the name) many phenomena tend to have a Normal or approximately Normal distribution.

Exercise: Draw some data of sizes 10, 100, and 1000 from each of these three distributions. Calculate a histogram in each case. Calculate the mean of your data. Do you notice anything funny?

1.5 Expectations

Given a random variable X, its expected value is defined as

    E[X] = ∫_x x p(x) dx

if X is continuous, and

    E[X] = Σ_x x p(x)

if X is discrete.

Exercise: Suppose X is uniform. What is the expected value? (Answer:

    E[X] = ∫_a^b x · 1/(b - a) dx = (1/(b - a)) · (b² - a²)/2 = (b + a)(b - a) / (2(b - a)) = (a + b)/2.)

Exercise: Suppose X is Bernoulli. What is the expected value? (Answer: 0·p(0) + 1·p(1) = θ.)

Exercise: Suppose X is Normal. What is the expected value? (Answer: the calculus gets ugly. However, clearly by symmetry the answer is µ.)

An important property of expectations is that

Theorem 1. The expected value of a sum of a finite number of random variables is the sum of the expected values, i.e.

    E[ Σ_{i=1}^n X_i ] = Σ_{i=1}^n E[X_i].

Note that this theorem does not assume anything about the random variables (other than that the expected values exist). In particular, we do not assume that they are independent.
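The sampling exercise above can be sketched in a few lines of Python. This is one possible setup, not the only one: the particular parameters (Uniform on [0, 1], Bernoulli with θ = 0.3, Normal with µ = 0 and σ = 1) and the helper name `draw` are our own choices.

```python
import random
import statistics

def draw(dist, n, rng):
    """Draw n samples from one of the three distributions of Section 1.4."""
    if dist == "uniform":      # Uniform on [a, b] with a = 0, b = 1
        return [rng.uniform(0.0, 1.0) for _ in range(n)]
    if dist == "bernoulli":    # Bernoulli with θ = 0.3
        return [1 if rng.random() < 0.3 else 0 for _ in range(n)]
    if dist == "normal":       # Normal with µ = 0, σ = 1
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    raise ValueError(dist)

rng = random.Random(0)  # fixed seed so the experiment is repeatable
for dist in ("uniform", "bernoulli", "normal"):
    for n in (10, 100, 1000):
        mean = statistics.mean(draw(dist, n, rng))
        print(f"{dist:9s} n={n:4d} mean={mean:+.3f}")
```

The "funny" thing you should notice: as n grows, the sample means settle down toward the true means (1/2, θ, and µ respectively), which is exactly the phenomenon Section 2 will make precise.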
Exercise: Prove this, for the case of two continuous random variables. Answer:

    E[X_1 + X_2] = ∫∫ (x_1 + x_2) p(x_1, x_2) dx_1 dx_2
                 = ∫∫ x_1 p(x_1, x_2) dx_1 dx_2 + ∫∫ x_2 p(x_1, x_2) dx_1 dx_2
                 = ∫ x_1 p(x_1) dx_1 + ∫ x_2 p(x_2) dx_2
                 = E[X_1] + E[X_2]

A second, easy property of expectations is this:

Theorem 2. The expected value of a constant times a random variable is that constant times the expected value, i.e.

    E[aX] = a E[X].

Exercise: Prove this. Answer (for continuous variables):

    E[aX] = ∫ a x p(x) dx = a ∫ x p(x) dx = a E[X]

Another important property, which is true only for independent random variables, is this:

Theorem 3. The expected value of a product of a finite number of independent random variables is the product of the expected values, i.e.

    E[ Π_{i=1}^n X_i ] = Π_{i=1}^n E[X_i].

Exercise: Prove this for the case of two variables. Answer:

    E[X_1 X_2] = ∫∫ x_1 x_2 p(x_1, x_2) dx_1 dx_2
               = ∫∫ x_1 x_2 p(x_1) p(x_2) dx_1 dx_2    (using independence)
               = ∫ x_1 p(x_1) dx_1 · ∫ x_2 p(x_2) dx_2
               = E[X_1] E[X_2]
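Theorems 1 and 3 are easy to sanity-check by simulation. A small Monte Carlo sketch in Python, with our own choice of two independent Uniform(0, 1) variables: E[X_1 + X_2] should come out near 1/2 + 1/2 = 1, and by independence E[X_1 · X_2] should come out near (1/2)·(1/2) = 0.25.

```python
import random

# Monte Carlo check of Theorems 1 and 3 for two independent
# Uniform(0, 1) random variables.
rng = random.Random(0)
n = 200_000
sum_mean = 0.0   # running estimate of E[X1 + X2]
prod_mean = 0.0  # running estimate of E[X1 * X2]
for _ in range(n):
    x1, x2 = rng.random(), rng.random()
    sum_mean += (x1 + x2) / n
    prod_mean += (x1 * x2) / n

print(f"E[X1 + X2] ~ {sum_mean:.3f}  (exact: 1.0)")
print(f"E[X1 * X2] ~ {prod_mean:.3f} (exact: 0.25)")
```

Note that the product rule would fail for dependent variables (e.g. X_2 = X_1 gives E[X_1²] = 1/3, not 1/4), while the sum rule would still hold; that asymmetry is exactly the point of the independence assumption in Theorem 3.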
1.6 Variance

Given some random variable X with mean µ = E[X], the variance is defined to be

    V[X] = E[(X - µ)²].

A standard and useful result is that

Theorem 4. V[X] = E[X²] - µ².

Exercise: Prove this. (Answer:

    V[X] = E[(X - µ)²] = E[X² - 2Xµ + µ²] = E[X²] - 2µE[X] + µ² = E[X²] - µ².)

1.7 Exercise

Exercise: Suppose we have a dataset of size N, generated from a Bernoulli distribution (i.e., a bent coin). Let the data be X_1, X_2, X_3, ..., X_N. Suppose we want to estimate the parameter θ for this distribution. The obvious estimator would be

    θ̂_N = (1/N) Σ_{i=1}^N X_i.

That is, we estimate the bias of the coin to be exactly the fraction of the data that resulted in a head.

Part 1: What is the expected value of θ̂_N?

Part 2: What is the variance of θ̂_N?

Part 3: Simulate this estimator and calculate its variance. Specifically, write a function that takes a value of θ and N, generates many datasets of size N, and computes the estimate on each. Make sure that your simulation actually displays the variance you calculated.

Answer to part 1:

    E[θ̂_N] = E[ (1/N) Σ_i X_i ]
            = (1/N) E[ Σ_i X_i ]    (by Theorem 2)
            = (1/N) Σ_i E[X_i]      (by Theorem 1)
            = (1/N) · Nθ
            = θ

Answer to part 2, in laborious detail:

    V[θ̂_N] = E[(θ̂_N - θ)²]
            = E[θ̂_N²] - θ²          (by Theorem 4, since E[θ̂_N] = θ)

    E[θ̂_N²] = (1/N²) E[ (Σ_i X_i)² ]
             = (1/N²) E[ Σ_i Σ_j X_i X_j ]
             = (1/N²) E[ Σ_i X_i² + Σ_i Σ_{j≠i} X_i X_j ]            (split into two groups)
             = (1/N²) E[ Σ_i X_i + Σ_i Σ_{j≠i} X_i X_j ]             (since 0² = 0 and 1² = 1)
             = (1/N²) ( Σ_i E[X_i] + Σ_i Σ_{j≠i} E[X_i X_j] )        (by Theorem 1)
             = (1/N²) ( Σ_i E[X_i] + Σ_i Σ_{j≠i} E[X_i] E[X_j] )     (by Theorem 3)
             = (1/N²) ( Nθ + N(N-1)θ² )

Thus, finally, the variance is

    V[θ̂_N] = θ/N + ((N-1)/N) θ² - θ²
            = θ/N - θ²/N
            = (1/N)(θ - θ²).
In particular, over the [0, 1] interval, θ - θ² is maximized at θ = 1/2, where it equals 1/4, and so

    V[θ̂_N] <= 1/(4N).

Answer to part 3:

    function estimate_bernoulli_variance(theta, N)
      maxrep = 10000;                 % number of simulated datasets
      theta_est = zeros(maxrep, 1);
      for rep = 1:maxrep
        X = sample_bernoulli(N, theta);   % helper: draws N Bernoulli(theta) samples
        theta_est(rep) = mean(X);
      end
      [mean(theta_est) theta]                    % empirical vs. true mean
      [var(theta_est) (1/N)*(theta - theta^2)]   % empirical vs. theoretical variance

1.8 More Information

See Arian Maleki and Tom Do's review of probability theory.
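As a preview of Section 3, we can already check the 95% coverage guarantee from Section 1.2 by simulation. The sketch below (in Python rather than the MATLAB above, purely for illustration) builds a Hoeffding-style interval θ̂ ± √(ln(2/α)/(2N)) around the Bernoulli estimate; Hoeffding's inequality guarantees this interval misses the true θ at most a fraction α of the time. The particular values of θ, N, and the repetition count are our own choices.

```python
import math
import random

def hoeffding_interval(data, alpha=0.05):
    """Hoeffding-style (1 - alpha) confidence interval for a Bernoulli mean."""
    n = len(data)
    theta_hat = sum(data) / n
    eps = math.sqrt(math.log(2 / alpha) / (2 * n))
    return theta_hat - eps, theta_hat + eps

rng = random.Random(0)
theta, N, reps = 0.3, 100, 2000
covered = 0
for _ in range(reps):
    data = [1 if rng.random() < theta else 0 for _ in range(N)]
    lo, hi = hoeffding_interval(data)
    covered += (lo <= theta <= hi)

# Hoeffding guarantees coverage of at least 1 - alpha = 95%; in practice
# the bound is conservative, so empirical coverage is usually higher.
print(f"empirical coverage: {covered / reps:.3f}")
```

This is exactly the guarantee discussed earlier: the true θ is fixed, the interval is random, and "95%" describes how often the procedure succeeds over repeated experiments, not the probability for any single interval.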
More informationMATH MW Elementary Probability Course Notes Part I: Models and Counting
MATH 2030 3.00MW Elementary Probability Course Notes Part I: Models and Counting Tom Salisbury salt@yorku.ca York University Winter 2010 Introduction [Jan 5] Probability: the mathematics used for Statistics
More informationSTAT2201. Analysis of Engineering & Scientific Data. Unit 3
STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random
More informationProblems from Probability and Statistical Inference (9th ed.) by Hogg, Tanis and Zimmerman.
Math 224 Fall 2017 Homework 1 Drew Armstrong Problems from Probability and Statistical Inference (9th ed.) by Hogg, Tanis and Zimmerman. Section 1.1, Exercises 4,5,6,7,9,12. Solutions to Book Problems.
More informationTheFourierTransformAndItsApplications-Lecture28
TheFourierTransformAndItsApplications-Lecture28 Instructor (Brad Osgood):All right. Let me remind you of the exam information as I said last time. I also sent out an announcement to the class this morning
More informationExpectation is linear. So far we saw that E(X + Y ) = E(X) + E(Y ). Let α R. Then,
Expectation is linear So far we saw that E(X + Y ) = E(X) + E(Y ). Let α R. Then, E(αX) = ω = ω (αx)(ω) Pr(ω) αx(ω) Pr(ω) = α ω X(ω) Pr(ω) = αe(x). Corollary. For α, β R, E(αX + βy ) = αe(x) + βe(y ).
More informationMITOCW watch?v=vjzv6wjttnc
MITOCW watch?v=vjzv6wjttnc PROFESSOR: We just saw some random variables come up in the bigger number game. And we're going to be talking now about random variables, just formally what they are and their
More informationPart 3: Parametric Models
Part 3: Parametric Models Matthew Sperrin and Juhyun Park August 19, 2008 1 Introduction There are three main objectives to this section: 1. To introduce the concepts of probability and random variables.
More informationMATH2206 Prob Stat/20.Jan Weekly Review 1-2
MATH2206 Prob Stat/20.Jan.2017 Weekly Review 1-2 This week I explained the idea behind the formula of the well-known statistic standard deviation so that it is clear now why it is a measure of dispersion
More informationSDS 321: Introduction to Probability and Statistics
SDS 321: Introduction to Probability and Statistics Lecture 10: Expectation and Variance Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin www.cs.cmu.edu/ psarkar/teaching
More informationMI 4 Mathematical Induction Name. Mathematical Induction
Mathematical Induction It turns out that the most efficient solution to the Towers of Hanoi problem with n disks takes n 1 moves. If this isn t the formula you determined, make sure to check your data
More informationSolution to Proof Questions from September 1st
Solution to Proof Questions from September 1st Olena Bormashenko September 4, 2011 What is a proof? A proof is an airtight logical argument that proves a certain statement in general. In a sense, it s
More informationStatistical Inference, Populations and Samples
Chapter 3 Statistical Inference, Populations and Samples Contents 3.1 Introduction................................... 2 3.2 What is statistical inference?.......................... 2 3.2.1 Examples of
More informationWhy should you care?? Intellectual curiosity. Gambling. Mathematically the same as the ESP decision problem we discussed in Week 4.
I. Probability basics (Sections 4.1 and 4.2) Flip a fair (probability of HEADS is 1/2) coin ten times. What is the probability of getting exactly 5 HEADS? What is the probability of getting exactly 10
More informationToss 1. Fig.1. 2 Heads 2 Tails Heads/Tails (H, H) (T, T) (H, T) Fig.2
1 Basic Probabilities The probabilities that we ll be learning about build from the set theory that we learned last class, only this time, the sets are specifically sets of events. What are events? Roughly,
More informationChapter 2. Mathematical Reasoning. 2.1 Mathematical Models
Contents Mathematical Reasoning 3.1 Mathematical Models........................... 3. Mathematical Proof............................ 4..1 Structure of Proofs........................ 4.. Direct Method..........................
More informationInferring information about models from samples
Contents Inferring information about models from samples. Drawing Samples from a Probability Distribution............. Simple Samples from Matlab.................. 3.. Rejection Sampling........................
More informationNote that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).
Confidence Intervals 1) What are confidence intervals? Simply, an interval for which we have a certain confidence. For example, we are 90% certain that an interval contains the true value of something
More informationMAT Mathematics in Today's World
MAT 1000 Mathematics in Today's World Last Time We discussed the four rules that govern probabilities: 1. Probabilities are numbers between 0 and 1 2. The probability an event does not occur is 1 minus
More information