Chapter 2: Simple Random Sampling and a Brief Review of Probability


Forest Before the Trees

Chapters 2-6 primarily investigate survey analysis. We begin with the basic analyses:
- Those that differ according to the sampling method.
- Those that differ by making use of extra information contained in covariates (Chapter 3).

Along the way we'll consider the differences between the basic sampling methods from a design perspective as well. But before diving into these topics, we should look at the big picture first (i.e., what these basic forms of sampling are).

As previously discussed, a sampling method can be classified as probability sampling or non-probability sampling.

Probability Sampling
- Simple Random Sampling
- Stratified Sampling
- Cluster Sampling
- Complex Sampling

Non-Probability Sampling
- Convenience Sampling
- Quota Sampling
- Snowball Sampling
- Judgement Sampling

Meaningful analysis requires a probability sampling method. Only these methods let us know the sampling distribution of estimators, which in turn gives us a grasp of the sampling variability. Thus the methods we will concentrate on are:

Simple Random Sampling

This is the sampling method we are used to. An underlying assumption of any intro stats course is that the random sample was obtained by simple random sampling. A simple random sample (SRS) is the simplest sampling method from a conceptual and analytical perspective. It is not, however, necessarily the easiest to carry out.

Definition: An SRS is one in which every sample of size n has equal probability of being selected.

Example: A wildlife biologist wants to estimate the number of trees in a forest affected by a new parasite introduced to the area via global warming. Rather than investigate every tree, he divides the map of the forest into square meters, randomly selects 100 of the squares and then investigates any tree in each selected square (or the closest tree if the square contains none). Every subset of 100 squares is equally likely to be selected. Within a few days the task is done and the information obtained, which is much more realistic than investigating every tree!

Stratified Sampling

Stratified samples can have many advantages over an SRS, such as increased precision of estimates and (possibly) easier implementation.

Definition: Stratified sampling has two stages. In the first, the population is divided into groups (strata). In the second, an SRS is drawn in each group.

Example: The wildlife biologist thinks there will be a difference in infection rate according to altitude. He first divides the forest into three regions according to altitude and then samples from each of the regions.
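The two-stage definition of stratified sampling can be sketched in a few lines of Python. This is only an illustration: the stratum names, the number of squares in each altitude band and the per-stratum sample sizes are made-up assumptions, not values from the notes.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw an SRS without replacement inside each stratum.

    strata: dict mapping stratum name -> list of unit IDs (hypothetical frame)
    n_per_stratum: dict mapping stratum name -> sample size in that stratum
    """
    rng = random.Random(seed)
    return {name: rng.sample(units, n_per_stratum[name])
            for name, units in strata.items()}

# Hypothetical frame: square IDs grouped into three altitude bands.
strata = {
    "low": list(range(0, 500)),
    "mid": list(range(500, 800)),
    "high": list(range(800, 1000)),
}
sizes = {"low": 50, "mid": 30, "high": 20}  # assumed allocation, 100 squares in total

sample = stratified_sample(strata, sizes, seed=1)
```

Every stratum is guaranteed to be represented, which a single SRS of 100 squares from the whole forest would not guarantee.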

Cluster Sampling

Cluster sampling has the major advantage of being easy to carry out, but it decreases the precision of the estimates.

Definition: Cluster sampling has two or more stages. In two-stage sampling, we first sample groups using an SRS and then sample units from the selected groups.

Example: The forest that is being studied has three natural subregions separated by rivers. In order to minimize travel time, the biologist randomly selects one of the regions and then randomly selects squares from that region.

Why are these probability sampling methods? Randomness is not the only ingredient required. There needs to be order to the randomness. Specifically, we need to know what units are in the population and what probability each one has of being selected. Thus a brief review of probability is in order.

Probability

Statistical results are built entirely upon the results of probability.

Probability of Events

Sample Space: The set of all possible outcomes for a given process, which we'll denote by Ω.

Event: A subset of outcomes from the sample space (Ω) that we are interested in, denoted by capital letters. Events either occur or do not occur.

Population or Universe: The collection of all units we wish to study, which we'll denote by U.

Axioms and Properties of Probabilities

Let P(E) denote the probability that event E occurs. Here P(E) is a quantity which needs to follow certain rules.

Kolmogorov Axioms of Probability:
1. The probability of an event is a non-negative real number: $P(E) \ge 0$.
2. The probability that some elementary event in the sample space occurs is 1: $P(\Omega) = 1$.
3. If $E_1, E_2, \ldots, E_N$ is a finite sequence of pairwise disjoint events, then $P(E_1 \cup E_2 \cup \cdots \cup E_N) = \sum_{i=1}^{N} P(E_i)$.

These axioms ensure that probability assignments to events are consistent with our notion of probability, but more thought is required to properly assign the probability of an event.

Many results follow from the above axioms. If we add to them the definitions of conditional probability and independence, we get the following useful results:
- Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. If A and B are disjoint, then $A \cap B = \emptyset$ and $P(A \cup B) = P(A) + P(B)$.
- Multiplication Rule: $P(A \cap B) = P(B \mid A)P(A)$, and if A and B are independent, $P(A \cap B) = P(B)P(A)$.
- $P(A^c) = 1 - P(A)$, where $A^c$ denotes the complement of event A.
- If $A \subseteq B$, then $P(A) \le P(B)$.
- $P(A \cup B) = P(B \cup A)$ and $P(A \cap B) = P(B \cap A)$.
- Note: If A and B are disjoint, then $P(A \cap B) = 0$. The converse does not hold in general: $P(A \cap B) = 0$ alone does not imply that A and B are disjoint.
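These rules can be checked numerically on a small sample space. The sketch below assumes one roll of a fair die with six equally likely faces; the events A and B are arbitrary choices for illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (illustrative assumption).
omega = frozenset(range(1, 7))

def prob(event):
    """P(E) = (number of outcomes in E) / N for equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}  # even roll
B = {4, 5, 6}  # roll of at least 4

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
addition_ok = prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Complement rule: P(A^c) = 1 - P(A)
complement_ok = prob(omega - A) == 1 - prob(A)

# Multiplication rule: P(A and B) = P(B | A) P(A)
p_B_given_A = prob(A & B) / prob(A)
multiplication_ok = prob(A & B) == p_B_given_A * prob(A)

# A and B are independent only if P(A and B) = P(A) P(B); here they are not.
independent = prob(A & B) == prob(A) * prob(B)
```

Exact fractions are used instead of floats so the equalities can be tested without rounding error.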

The following theorem highlights the importance of counting and is used extensively in sampling.

Theorem: Let $\Omega = \{\omega_1, \omega_2, \ldots, \omega_N\}$ consist of N equally likely elementary outcomes, where N is finite. Let E be any event in Ω. Then
$P(E) = \frac{\text{number of } \omega\text{'s in } E}{N}$

To help us with counting we will use the binomial coefficient. Suppose we have n distinct objects and we would like to choose k of them. If we can only choose each object once (no replacement) and if order does not matter [choosing the objects (a, b) is the same as choosing (b, a)], then how many distinct sets of k objects can we select from the n of them? (E.g., select k = 6 balls from n = 49, or select 100 square meters from 102 million square meters of forest.)

How many ways can we choose k objects from n distinct objects?
$\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$, where $k! = k(k-1)(k-2)\cdots 2 \cdot 1$

Exercise: What is the probability of choosing the winning Lotto 6/49 number with a single ticket?

Birthday Problem: Suppose there are n people in a room. What is the probability that at least 2 will have the same birthday?

One of the keys to probability sampling is that each sample has a known probability of being selected. As such we can find the probability of each unit being selected. In some cases, it is easier to determine the probability of each unit being selected and to use this to determine the probability of obtaining a specific sample. We use the following notation:
$\pi_i = P(i\text{th unit is in the sample})$

Examples: Consider this class as a population. If we run an SRS without replacement of 10 units, what is the probability that you will be selected?
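The counting questions above can be explored with Python's standard library. This is a sketch under stated assumptions: the birthday calculation assumes 365 equally likely birthdays, and the class size of 40 is a hypothetical value, since the notes do not give one.

```python
from math import comb, prod

# Lotto 6/49: one ticket is one of C(49, 6) equally likely combinations.
lotto_ways = comb(49, 6)
p_win = 1 / lotto_ways

def birthday_match_prob(n, days=365):
    """P(at least two of n people share a birthday), assuming equally likely days."""
    if n > days:
        return 1.0
    p_all_distinct = prod((days - i) / days for i in range(n))
    return 1 - p_all_distinct

def srs_inclusion_prob(N, n):
    """pi_i for an SRS without replacement: samples containing unit i / all samples."""
    return comb(N - 1, n - 1) / comb(N, n)

p_birthday_23 = birthday_match_prob(23)
pi_class = srs_inclusion_prob(40, 10)  # hypothetical class of N = 40 students
```

The inclusion probability counts the samples that contain a given unit, C(N-1, n-1), and divides by the number of possible samples, C(N, n), which simplifies to n/N.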

If we choose 5 individuals of each gender, what is the probability that you will be selected? What type of sampling method is this?

Example: Stats Canada is running a survey of hospital patients. They use a multistage sampling method in which they randomly select 5 provinces. Within each of these provinces they select 2 cities and 2 rural areas. Within each of these they select 3 hospitals or clinics, and then randomly select 2 departments in each. Finally they select 5 patients from each department. Alice is in Vancouver General recovering in the ICU. How can we calculate the probability of her being selected?

In the rare event of an unusual sampling method that is not as systematic as these, but for which the list of possible samples is available, simply count the number of samples containing the ith unit and divide by the number of possible samples (this works when all samples are equally likely).

Random Variables

Probabilities of events are of limited use to us, but random variables are based on them and are themselves the basis for all inferential methods we will consider. Recall that a random variable has an expectation and a variance.

For a discrete random variable X, the expected value is
$E(X) = \sum_x x f(x) = \sum_x x P(X = x)$
and the variance is
$Var(X) = E[(X - \mu_x)^2] = \sigma_x^2 = \sum_{x \in R} (x - \mu_x)^2 f(x)$

Example: A population has the following four elements: U = {1, 4, 6, 9}. What are the expected value and variance of the sample average for samples of size two?
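One way to check an answer to this example is brute-force enumeration of the sampling distribution. The sketch below assumes the samples are drawn by SRS without replacement, so that each of the six possible pairs is equally likely; the example itself does not state the sampling method.

```python
from fractions import Fraction
from itertools import combinations

U = [1, 4, 6, 9]
n = 2

# Every unordered sample of size n (assumed SRS without replacement).
samples = list(combinations(U, n))
means = [Fraction(sum(s), n) for s in samples]
prob = Fraction(1, len(samples))  # each sample is equally likely

# E(X) = sum of x * P(X = x), applied to the sample mean.
expected = sum(m * prob for m in means)

# Var(X) = sum of (x - mu)^2 * P(X = x), applied to the sample mean.
variance = sum((m - expected) ** 2 * prob for m in means)

population_mean = Fraction(sum(U), len(U))
```

Under this assumption the expected value of the sample mean equals the population mean, 5, and its variance is 17/6.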

Definitions and Properties

- For a function g, $E[g(X)] = \sum_x g(x) P(X = x)$.
- If a and b are constants, then $E[aX + b] = aE[X] + b$.
- If X and Y are independent, then $E[XY] = E[X]E[Y]$.
- $Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)$.

Question: Is $E[X/Y] = E[X]/E[Y]$?

Sampling Distributions

Given a population of size N, we use the following notation/equations for the population characteristics.

$t = \sum_{i=1}^{N} y_i$ = Population Total
$\bar{y}_U = \frac{1}{N} \sum_{i=1}^{N} y_i$ = Population Mean
$\rho = \frac{1}{N} \sum_{i=1}^{N} y_i$ = Population Proportion, when y is dichotomous (0, 1)
$S^2 = \sigma_y^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y}_U)^2$ = Variance of Population values

For the special case of a dichotomous variable, $S^2 = \frac{N}{N-1}\rho(1 - \rho)$, which is approximately $\rho(1 - \rho)$ when N is large.

For each of these population characteristics, we have sample analogs. Suppose we draw a sample of size n from $U = \{y_1, y_2, \ldots, y_N\}$, where N > n. Here $y_i$ is the value of Y (the random variable) for the ith element in the sample S.

$t_s = \sum_{i=1}^{n} y_i$ = Sample Total
$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$ = Sample Mean
$\hat{p} = \frac{1}{n} \sum_{i=1}^{n} y_i$ = Sample Proportion, when y is dichotomous (0, 1)
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$ = Sample Variance

For the special case of a dichotomous variable, $s^2 = \frac{n}{n-1}\hat{p}(1 - \hat{p})$.

Note: Sample quantities exhibit variability. They are examples of summary statistics; their distributions are called sampling distributions. The way in which we obtain the sample will dictate the sampling distribution (more on this later). We don't know the population characteristics; if we did, we wouldn't sample. Let's use these definitions to motivate estimation techniques.

Example: Suppose we have 200 bags of mail addressed to Kris Kringle. We want to know more, so we sample 20 bags. The number of letters in each of the sampled bags:
{655, 721, 687, 547, 632, 611, 589, 651, 432, 752, 671, 619, 633, 631, 711, 712, 598, 705, 606, 669}
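As a rough illustration of how the sample quantities turn into estimates, the sketch below computes the sample mean, sample variance and an estimated total number of letters for the mail example. It assumes the 20 bags were drawn by SRS without replacement from the N = 200 bags, and it applies the finite population correction that the notes introduce later in the chapter; the notes do not say which quantities the example is meant to estimate.

```python
from math import sqrt
from statistics import mean, variance

N = 200  # bags in the population
letters = [655, 721, 687, 547, 632, 611, 589, 651, 432, 752,
           671, 619, 633, 631, 711, 712, 598, 705, 606, 669]
n = len(letters)

y_bar = mean(letters)    # sample mean
s2 = variance(letters)   # sample variance (n - 1 divisor)
t_hat = N * y_bar        # estimated total number of letters

# Standard errors, assuming SRS without replacement (fpc = 1 - n/N).
se_y_bar = sqrt((1 - n / N) * s2 / n)
se_t_hat = N * se_y_bar
```

With these data the sample mean is 641.6 letters per bag, giving an estimated total of 128,320 letters.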

Review of Estimators

With any point estimate, we'd like to know its properties; otherwise there would be many choices of estimates and no way to choose among them.

Estimator: A function of random variables used to estimate a parameter. It is itself random.

Estimate: The realization of an estimator. It is a fixed numerical value. Thus, we cannot say anything about the estimate itself; we can only qualify the method used to obtain it, namely the estimator. It makes no sense to speak of the variability of an estimate.

Unbiased: An estimator is unbiased if its expected value is equal to the characteristic it is trying to estimate.

Consistent: An estimator is called consistent if it converges to the parameter as n grows; for an unbiased estimator this holds when its variance converges to 0 as n tends to ∞.

MVUE: An estimator is the minimum variance unbiased estimator if it has the smallest variance amongst unbiased estimators.

Knowing whether the estimator has these qualities is only half the battle. By itself an estimator is of little value. The standard error allows us to understand how wrong we're likely to be. We call the square root of the variance of an estimator the Standard Error. Much of this course deals with finding the appropriate standard errors for the various estimators we encounter.

Naturally, an estimator is biased if its average value differs from the true parameter: $Bias[\hat{\theta}] = E[\hat{\theta}] - \theta$. The question is: Is unbiased always better? There can be occasions where a biased estimator tends to fall closer to the parameter on average than does the unbiased estimator. The mean squared error captures this notion:
$MSE(\hat{\theta}) = E[(\hat{\theta} - \theta)^2] = Var(\hat{\theta}) + [Bias(\hat{\theta})]^2$
Unfortunately, we typically can't measure an estimator's bias, so we favor the MVUE.

Simple Random Sampling

As previously defined, a simple random sample is a sample where every possible sample of size n has equal probability of being sampled. Nevertheless, there are two forms of SRS: with or without replacement. A simple random sample without replacement allows every unit in the population to appear at most once in each sample. By comparison, a simple random sample with replacement (SRSWR) allows any unit in the population to appear as many as n times in any sample.

To run an SRS without replacement, first select a unit randomly such that every unit has equal probability of being selected. Then select the second unit such that all remaining units have equal probability of being selected, and so forth. To run an SRSWR, select a unit with probability 1/N, replace it, and then select anew with each unit again having equal probability of being selected.

Remarks:
- It's pretty intuitive that selecting the same unit twice, thrice or more adds little information. Sampling without replacement is more efficient and it is the method we will use throughout the course. The example below will serve as further motivation to this end.
- Within each method every unit has the same inclusion probability, but that probability differs between the two methods.
- Is having an equal probability of inclusion enough to conclude that the sample is an SRS?
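The two selection procedures, and the remark that their inclusion probabilities differ, can be sketched as follows. The function names are illustrative; the draws rely on `random.sample` (without replacement) and `random.choices` (with replacement) from Python's standard library.

```python
import random

def srs_without_replacement(population, n, rng):
    """Each unit can appear at most once; every subset of size n is equally likely."""
    return rng.sample(population, n)

def srs_with_replacement(population, n, rng):
    """Each of the n draws picks any unit with probability 1/N, so repeats can occur."""
    return rng.choices(population, k=n)

def inclusion_prob_without(N, n):
    """pi_i under SRS without replacement."""
    return n / N

def inclusion_prob_with(N, n):
    """pi_i under SRSWR: one minus the chance that unit i is missed on all n draws."""
    return 1 - (1 - 1 / N) ** n

rng = random.Random(2024)  # arbitrary seed, only for reproducibility
population = list(range(1, 11))
sample_wor = srs_without_replacement(population, 4, rng)
sample_wr = srs_with_replacement(population, 4, rng)
```

For N = 4 and n = 2, for instance, the inclusion probability is 0.5 without replacement but 0.4375 with replacement.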

Since SRS without replacement is more efficient, we'll only consider this type of SRS after this chapter.

While SRS is simple conceptually and mathematically, it makes a strong assumption: all units are independent. There are many situations where this is not true, and ignoring that can lead to overconfidence. One example is sampling all individuals in a household on political affiliation. If there is structure to the population, there may be a better way of sampling which exploits and/or takes into account this structure. Thus, SRS should be used when there is no natural structure in the data or there is very little information available on the population.

Example (back to sampling distributions): We'll use a very trivial example to highlight some of the definitions we've discussed and to lead to further discussions. Consider the following population: U = {y_1 = kitten, y_2 = kitten, y_3 = puppy, y_4 = puppy}. Find the proportion of kittens and use the sampling distribution of the sample proportion to find $E[\hat{p}]$, $Var[\hat{p}]$ and $MSE[\hat{p}]$. We'll consider three sampling methods: SRS, SRSWR and Systematic Sampling.
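Because the population has only four units, the sampling distribution of the sample proportion can be enumerated exactly for each method. The sketch below assumes a sample size of n = 2, which the example does not specify, and takes the systematic sample over the list in the order given (kitten, kitten, puppy, puppy) with interval k = N/n = 2.

```python
from fractions import Fraction
from itertools import combinations, product

U = [1, 1, 0, 0]  # 1 = kitten, 0 = puppy, in the listed order
N, n = len(U), 2  # n = 2 is an assumed sample size
p = Fraction(sum(U), N)  # true proportion of kittens

def summarize(samples):
    """Return (E[p_hat], Var[p_hat], MSE[p_hat]) over equally likely samples."""
    p_hats = [Fraction(sum(U[i] for i in s), len(s)) for s in samples]
    w = Fraction(1, len(samples))
    e = sum(w * ph for ph in p_hats)
    var = sum(w * (ph - e) ** 2 for ph in p_hats)
    mse = sum(w * (ph - p) ** 2 for ph in p_hats)
    return e, var, mse

srs = list(combinations(range(N), n))        # unordered, no repeats
srswr = list(product(range(N), repeat=n))    # ordered draws, repeats allowed
k = N // n
systematic = [tuple(range(start, N, k)) for start in range(k)]

results = {
    "SRS": summarize(srs),
    "SRSWR": summarize(srswr),
    "Systematic": summarize(systematic),
}
```

Under these assumptions all three methods are unbiased, with variance 1/12 for SRS, 1/8 for SRSWR and 0 for systematic sampling. The systematic result depends entirely on the order of the list: each systematic sample happens to contain exactly one kitten.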

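Equations and Notation

Population Characteristics:
$t = \sum_{i=1}^{N} y_i$
$\bar{y}_U = \frac{1}{N} \sum_{i=1}^{N} y_i$ (the population proportion p is the special case of a dichotomous variable)
$S^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y}_U)^2$

Characteristic Estimators and Estimators of their Variance:
$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$
$Var(\bar{y}) = \left(1 - \frac{n}{N}\right) \frac{S^2}{n}$
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$
$\widehat{Var}(\bar{y}) = \left(1 - \frac{n}{N}\right) \frac{s^2}{n}$
$SE(\bar{y}) = \sqrt{\widehat{Var}(\bar{y})} = \sqrt{\left(1 - \frac{n}{N}\right) \frac{s^2}{n}}$
$\widehat{CV}(\bar{y}) = \frac{SE(\bar{y})}{\bar{y}}$
$\hat{t} = N\bar{y}$
$Var(\hat{t}) = N^2 Var(\bar{y}) = N^2 \left(1 - \frac{n}{N}\right) \frac{S^2}{n}$
$\widehat{Var}(\hat{t}) = N^2 \left(1 - \frac{n}{N}\right) \frac{s^2}{n}$
$Var(\hat{p}) = \frac{N-n}{N-1} \cdot \frac{p(1-p)}{n}$
$\widehat{Var}(\hat{p}) = \left(1 - \frac{n}{N}\right) \frac{\hat{p}(1-\hat{p})}{n-1}$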
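The two proportion formulas are the mean formulas applied to a 0/1 variable. The short derivation below is a sketch of that connection rather than part of the original notes; it uses the fact that for a dichotomous variable the sum of squared deviations from the mean equals $Np(1-p)$ in the population and $n\hat{p}(1-\hat{p})$ in the sample.

```latex
\begin{align*}
S^2 &= \frac{1}{N-1}\sum_{i=1}^{N}(y_i - p)^2 = \frac{N}{N-1}\,p(1-p), \\
Var(\hat{p}) &= \left(1 - \frac{n}{N}\right)\frac{S^2}{n}
  = \frac{N-n}{N}\cdot\frac{N}{N-1}\cdot\frac{p(1-p)}{n}
  = \frac{N-n}{N-1}\cdot\frac{p(1-p)}{n}, \\
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \hat{p})^2 = \frac{n}{n-1}\,\hat{p}(1-\hat{p}), \\
\widehat{Var}(\hat{p}) &= \left(1 - \frac{n}{N}\right)\frac{s^2}{n}
  = \left(1 - \frac{n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n-1}.
\end{align*}
```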

Dealing with Finite Populations

When sampling with replacement, no adjustments are required. We can continue to use the estimation methodology that we are already familiar with. However, when dealing with an SRS without replacement, we need to adjust the variance accordingly.

Finite Population Correction Term

The finite population correction term (fpc), $1 - \frac{n}{N}$, is used to lower the variance in accordance with how much of the population is found in the sample. For a formal derivation of the fpc see Lohr page 45.

Recall that the estimators are all summary statistics and as such follow a sampling distribution. We can intuitively understand that the larger the fraction of the population captured in the sample, the more informative the sample will be and the smaller the variance will be.

Comparing the standard error for various sample sizes, the extreme cases help us understand this adjustment intuitively. If we rely only on the sample size to drive the variance to zero, we'll need n to tend to ∞. With the fpc, the variance reaches 0 at n = N.

Typically, the sample size is much smaller than the population size. The sample size should not be chosen based on the population size. A sample of size 100 is about as effective for getting information on a population of size 10,000 as on one of size 1,000,000.

A rule of thumb: the fpc can be ignored if less than 5% of the population is in the sample.

Having explored the concepts of the fpc and the equations for SRS sampling, we can make the two following statements:

The estimator $\bar{y}$ is unbiased for the population mean in an SRS.

The estimator $\widehat{Var}(\bar{y}) = \left(1 - \frac{n}{N}\right) \frac{s^2}{n}$ is unbiased for $Var(\bar{y})$ in an SRS.

Confidence Intervals

Confidence intervals are becoming more and more prominent in statistical analysis (why would we favor a confidence interval over a hypothesis test?). We will use these extensively in this course. The construction of a confidence interval is the same as usual: $\hat{\theta} \pm z_{\alpha/2}\, SE(\hat{\theta})$. In order for this to be valid, we'll need to invoke the Central Limit Theorem.

The interpretation of a confidence interval becomes slightly more complicated in the context of a finite population. In order to keep the current interpretation, we'll have to imagine that the population being studied is part of a larger collection of populations, called a superpopulation.

Due to the fpc we must adjust the sample size calculation, where ME denotes the desired margin of error:
$n = \frac{z_{\alpha/2}^2 S^2}{ME^2 + \frac{z_{\alpha/2}^2 S^2}{N}}$

Example: Over the last week, 1000 people have visited St. Paul's Hospital due to ailment. We sample 100 from a list of phone numbers using an SRS and ask: How long did you wait before being attended to by a doctor? Miraculously, everyone responds, leading to the following summary statistics: $\bar{y}$ = 2.8 hours and s = 0.5 hours. Construct a confidence interval for the population mean $\bar{y}_U$ and the population total t. How large a sample size would be required for a second study if we want the margin of error for the total to be at most 50?
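The hospital example can be worked through numerically as below. Two assumptions are made that the example leaves open: a 95% confidence level (z = 1.96), and the use of the first study's s = 0.5 as the planning value for S in the sample size formula.

```python
from math import ceil, sqrt

N, n = 1000, 100
y_bar, s = 2.8, 0.5
z = 1.96  # assumed 95% confidence level

# Confidence interval for the population mean, with the fpc.
fpc = 1 - n / N
se_mean = sqrt(fpc * s**2 / n)
ci_mean = (y_bar - z * se_mean, y_bar + z * se_mean)

# Confidence interval for the population total: scale everything by N.
t_hat = N * y_bar
se_total = N * se_mean
ci_total = (t_hat - z * se_total, t_hat + z * se_total)

# Sample size for a margin of error of 50 on the total,
# which corresponds to 50 / N on the mean.
me_total = 50
me_mean = me_total / N
n_required = ceil(z**2 * s**2 / (me_mean**2 + z**2 * s**2 / N))
```

Under these assumptions the interval for the mean is roughly (2.71, 2.89) hours, the interval for the total is roughly (2707, 2893) hours, and the second study would need about 278 respondents.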

Systematic Sampling

In this sampling scheme we choose every kth element of the frame, with a random start at one of the first k units. Here k = N/n. Is this an SRS?

It can be used as a substitute for SRS if the sampling frame is in random order or if there is no sampling frame.

Example: A video store clerk wants to know how many videos customers rent on average every month. She samples every 15th customer starting with the 5th.

Care should be taken when using a systematic sampling scheme. Depending on the nature of the list, systematic sampling can be worse than, equivalent to or (in trivial cases) better than an SRS.
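A minimal sketch of the scheme, assuming the frame is an ordered Python list: pick a random start among the first k units, then take every kth unit after it. The frame of 150 customers is a made-up illustration.

```python
import random

def systematic_sample(frame, k, rng):
    """Every kth unit of the frame, starting at a random position among the first k."""
    start = rng.randrange(k)  # 0-based start, uniform over the first k units
    return frame[start::k]

# Hypothetical frame: customers numbered 1..150 in order of arrival.
customers = list(range(1, 151))

rng = random.Random(7)  # arbitrary seed, only for reproducibility
sample = systematic_sample(customers, 15, rng)

# The clerk's fixed choice from the example: every 15th customer starting with the 5th.
clerk_sample = customers[4::15]
```

Only k distinct samples are possible here, so most subsets of the frame can never be selected, even though every unit has the same inclusion probability 1/k.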


More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete

More information

THE SAMPLING DISTRIBUTION OF THE MEAN

THE SAMPLING DISTRIBUTION OF THE MEAN THE SAMPLING DISTRIBUTION OF THE MEAN COGS 14B JANUARY 26, 2017 TODAY Sampling Distributions Sampling Distribution of the Mean Central Limit Theorem INFERENTIAL STATISTICS Inferential statistics: allows

More information

Gov Multiple Random Variables

Gov Multiple Random Variables Gov 2000-4. Multiple Random Variables Matthew Blackwell September 29, 2015 Where are we? Where are we going? We described a formal way to talk about uncertain outcomes, probability. We ve talked about

More information

Lecture 3: Sizes of Infinity

Lecture 3: Sizes of Infinity Math/CS 20: Intro. to Math Professor: Padraic Bartlett Lecture 3: Sizes of Infinity Week 2 UCSB 204 Sizes of Infinity On one hand, we know that the real numbers contain more elements than the rational

More information

The Central Limit Theorem

The Central Limit Theorem The Central Limit Theorem Patrick Breheny September 27 Patrick Breheny University of Iowa Biostatistical Methods I (BIOS 5710) 1 / 31 Kerrich s experiment Introduction 10,000 coin flips Expectation and

More information

Homework 4 Solution, due July 23

Homework 4 Solution, due July 23 Homework 4 Solution, due July 23 Random Variables Problem 1. Let X be the random number on a die: from 1 to. (i) What is the distribution of X? (ii) Calculate EX. (iii) Calculate EX 2. (iv) Calculate Var

More information

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 10

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 10 EECS 70 Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 10 Introduction to Basic Discrete Probability In the last note we considered the probabilistic experiment where we flipped

More information

Class 8 Review Problems solutions, 18.05, Spring 2014

Class 8 Review Problems solutions, 18.05, Spring 2014 Class 8 Review Problems solutions, 8.5, Spring 4 Counting and Probability. (a) Create an arrangement in stages and count the number of possibilities at each stage: ( ) Stage : Choose three of the slots

More information

P (E) = P (A 1 )P (A 2 )... P (A n ).

P (E) = P (A 1 )P (A 2 )... P (A n ). Lecture 9: Conditional probability II: breaking complex events into smaller events, methods to solve probability problems, Bayes rule, law of total probability, Bayes theorem Discrete Structures II (Summer

More information

Chapter 9: Sampling Distributions

Chapter 9: Sampling Distributions Chapter 9: Sampling Distributions 1 Activity 9A, pp. 486-487 2 We ve just begun a sampling distribution! Strictly speaking, a sampling distribution is: A theoretical distribution of the values of a statistic

More information

Discrete Probability Refresher

Discrete Probability Refresher ECE 1502 Information Theory Discrete Probability Refresher F. R. Kschischang Dept. of Electrical and Computer Engineering University of Toronto January 13, 1999 revised January 11, 2006 Probability theory

More information

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 16. Random Variables: Distribution and Expectation

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 16. Random Variables: Distribution and Expectation CS 70 Discrete Mathematics and Probability Theory Spring 206 Rao and Walrand Note 6 Random Variables: Distribution and Expectation Example: Coin Flips Recall our setup of a probabilistic experiment as

More information

This does not cover everything on the final. Look at the posted practice problems for other topics.

This does not cover everything on the final. Look at the posted practice problems for other topics. Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry

More information

University of Regina. Lecture Notes. Michael Kozdron

University of Regina. Lecture Notes. Michael Kozdron University of Regina Statistics 252 Mathematical Statistics Lecture Notes Winter 2005 Michael Kozdron kozdron@math.uregina.ca www.math.uregina.ca/ kozdron Contents 1 The Basic Idea of Statistics: Estimating

More information

Examine characteristics of a sample and make inferences about the population

Examine characteristics of a sample and make inferences about the population Chapter 11 Introduction to Inferential Analysis Learning Objectives Understand inferential statistics Explain the difference between a population and a sample Explain the difference between parameter and

More information

Review Counting Principles Theorems Examples. John Venn. Arthur Berg Counting Rules 2/ 21

Review Counting Principles Theorems Examples. John Venn. Arthur Berg Counting Rules 2/ 21 Counting Rules John Venn Arthur Berg Counting Rules 2/ 21 Augustus De Morgan Arthur Berg Counting Rules 3/ 21 Algebraic Laws Let S be a sample space and A, B, C be three events in S. Commutative Laws:

More information

Lesson 8: Why Stay with Whole Numbers?

Lesson 8: Why Stay with Whole Numbers? Student Outcomes Students use function notation, evaluate functions for inputs in their domains, and interpret statements that use function notation in terms of a context. Students create functions that

More information

Probability Experiments, Trials, Outcomes, Sample Spaces Example 1 Example 2

Probability Experiments, Trials, Outcomes, Sample Spaces Example 1 Example 2 Probability Probability is the study of uncertain events or outcomes. Games of chance that involve rolling dice or dealing cards are one obvious area of application. However, probability models underlie

More information

Common Discrete Distributions

Common Discrete Distributions Common Discrete Distributions Statistics 104 Autumn 2004 Taken from Statistics 110 Lecture Notes Copyright c 2004 by Mark E. Irwin Common Discrete Distributions There are a wide range of popular discrete

More information

Review of Probabilities and Basic Statistics

Review of Probabilities and Basic Statistics Alex Smola Barnabas Poczos TA: Ina Fiterau 4 th year PhD student MLD Review of Probabilities and Basic Statistics 10-701 Recitations 1/25/2013 Recitation 1: Statistics Intro 1 Overview Introduction to

More information

STAT 418: Probability and Stochastic Processes

STAT 418: Probability and Stochastic Processes STAT 418: Probability and Stochastic Processes Spring 2016; Homework Assignments Latest updated on April 29, 2016 HW1 (Due on Jan. 21) Chapter 1 Problems 1, 8, 9, 10, 11, 18, 19, 26, 28, 30 Theoretical

More information

STAT2201. Analysis of Engineering & Scientific Data. Unit 3

STAT2201. Analysis of Engineering & Scientific Data. Unit 3 STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random

More information

Discrete Structures Prelim 1 Selected problems from past exams

Discrete Structures Prelim 1 Selected problems from past exams Discrete Structures Prelim 1 CS2800 Selected problems from past exams 1. True or false (a) { } = (b) Every set is a subset of its power set (c) A set of n events are mutually independent if all pairs of

More information

What is a parameter? What is a statistic? How is one related to the other?

What is a parameter? What is a statistic? How is one related to the other? Chapter Seven: SAMPLING DISTRIBUTIONS 7.1 Sampling Distributions Read 424 425 What is a parameter? What is a statistic? How is one related to the other? Example: Identify the population, the parameter,

More information

Refresher on Discrete Probability

Refresher on Discrete Probability Refresher on Discrete Probability STAT 27725/CMSC 25400: Machine Learning Shubhendu Trivedi University of Chicago October 2015 Background Things you should have seen before Events, Event Spaces Probability

More information

Introduction to Statistical Inference

Introduction to Statistical Inference Introduction to Statistical Inference Dr. Fatima Sanchez-Cabo f.sanchezcabo@tugraz.at http://www.genome.tugraz.at Institute for Genomics and Bioinformatics, Graz University of Technology, Austria Introduction

More information

Notes on Mathematics Groups

Notes on Mathematics Groups EPGY Singapore Quantum Mechanics: 2007 Notes on Mathematics Groups A group, G, is defined is a set of elements G and a binary operation on G; one of the elements of G has particularly special properties

More information

Please bring the task to your first physics lesson and hand it to the teacher.

Please bring the task to your first physics lesson and hand it to the teacher. Pre-enrolment task for 2014 entry Physics Why do I need to complete a pre-enrolment task? This bridging pack serves a number of purposes. It gives you practice in some of the important skills you will

More information

Answer keys for Assignment 10: Measurement of study variables (The correct answer is underlined in bold text)

Answer keys for Assignment 10: Measurement of study variables (The correct answer is underlined in bold text) Answer keys for Assignment 10: Measurement of study variables (The correct answer is underlined in bold text) 1. A quick and easy indicator of dispersion is a. Arithmetic mean b. Variance c. Standard deviation

More information

Week 12-13: Discrete Probability

Week 12-13: Discrete Probability Week 12-13: Discrete Probability November 21, 2018 1 Probability Space There are many problems about chances or possibilities, called probability in mathematics. When we roll two dice there are possible

More information

Joint, Conditional, & Marginal Probabilities

Joint, Conditional, & Marginal Probabilities Joint, Conditional, & Marginal Probabilities The three axioms for probability don t discuss how to create probabilities for combined events such as P [A B] or for the likelihood of an event A given that

More information

CS 246 Review of Proof Techniques and Probability 01/14/19

CS 246 Review of Proof Techniques and Probability 01/14/19 Note: This document has been adapted from a similar review session for CS224W (Autumn 2018). It was originally compiled by Jessica Su, with minor edits by Jayadev Bhaskaran. 1 Proof techniques Here we

More information

STAT 414: Introduction to Probability Theory

STAT 414: Introduction to Probability Theory STAT 414: Introduction to Probability Theory Spring 2016; Homework Assignments Latest updated on April 29, 2016 HW1 (Due on Jan. 21) Chapter 1 Problems 1, 8, 9, 10, 11, 18, 19, 26, 28, 30 Theoretical Exercises

More information

Probability and Inference. POLI 205 Doing Research in Politics. Populations and Samples. Probability. Fall 2015

Probability and Inference. POLI 205 Doing Research in Politics. Populations and Samples. Probability. Fall 2015 Fall 2015 Population versus Sample Population: data for every possible relevant case Sample: a subset of cases that is drawn from an underlying population Inference Parameters and Statistics A parameter

More information

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 8. For any two events E and F, P (E) = P (E F ) + P (E F c ). Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 Sample space. A sample space consists of a underlying

More information

Chapter 8: An Introduction to Probability and Statistics

Chapter 8: An Introduction to Probability and Statistics Course S3, 200 07 Chapter 8: An Introduction to Probability and Statistics This material is covered in the book: Erwin Kreyszig, Advanced Engineering Mathematics (9th edition) Chapter 24 (not including

More information

1 Basic continuous random variable problems

1 Basic continuous random variable problems Name M362K Final Here are problems concerning material from Chapters 5 and 6. To review the other chapters, look over previous practice sheets for the two exams, previous quizzes, previous homeworks and

More information

Final Exam # 3. Sta 230: Probability. December 16, 2012

Final Exam # 3. Sta 230: Probability. December 16, 2012 Final Exam # 3 Sta 230: Probability December 16, 2012 This is a closed-book exam so do not refer to your notes, the text, or any other books (please put them on the floor). You may use the extra sheets

More information

7 Random samples and sampling distributions

7 Random samples and sampling distributions 7 Random samples and sampling distributions 7.1 Introduction - random samples We will use the term experiment in a very general way to refer to some process, procedure or natural phenomena that produces

More information

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 12. Random Variables: Distribution and Expectation

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 12. Random Variables: Distribution and Expectation CS 70 Discrete Mathematics and Probability Theory Fall 203 Vazirani Note 2 Random Variables: Distribution and Expectation We will now return once again to the question of how many heads in a typical sequence

More information

Inferential Statistics. Chapter 5

Inferential Statistics. Chapter 5 Inferential Statistics Chapter 5 Keep in Mind! 1) Statistics are useful for figuring out random noise from real effects. 2) Numbers are not absolute, and they can be easily manipulated. 3) Always scrutinize

More information

Class 26: review for final exam 18.05, Spring 2014

Class 26: review for final exam 18.05, Spring 2014 Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event

More information

Lecture 10: Bayes' Theorem, Expected Value and Variance Lecturer: Lale Özkahya

Lecture 10: Bayes' Theorem, Expected Value and Variance Lecturer: Lale Özkahya BBM 205 Discrete Mathematics Hacettepe University http://web.cs.hacettepe.edu.tr/ bbm205 Lecture 10: Bayes' Theorem, Expected Value and Variance Lecturer: Lale Özkahya Resources: Kenneth Rosen, Discrete

More information

Fundamentals of Probability CE 311S

Fundamentals of Probability CE 311S Fundamentals of Probability CE 311S OUTLINE Review Elementary set theory Probability fundamentals: outcomes, sample spaces, events Outline ELEMENTARY SET THEORY Basic probability concepts can be cast in

More information

7. Be able to prove Rules in Section 7.3, using only the Kolmogorov axioms.

7. Be able to prove Rules in Section 7.3, using only the Kolmogorov axioms. Midterm Review Solutions for MATH 50 Solutions to the proof and example problems are below (in blue). In each of the example problems, the general principle is given in parentheses before the solution.

More information

Statistics II Lesson 1. Inference on one population. Year 2009/10

Statistics II Lesson 1. Inference on one population. Year 2009/10 Statistics II Lesson 1. Inference on one population Year 2009/10 Lesson 1. Inference on one population Contents Introduction to inference Point estimators The estimation of the mean and variance Estimating

More information

Probability 1 (MATH 11300) lecture slides

Probability 1 (MATH 11300) lecture slides Probability 1 (MATH 11300) lecture slides Márton Balázs School of Mathematics University of Bristol Autumn, 2015 December 16, 2015 To know... http://www.maths.bris.ac.uk/ mb13434/prob1/ m.balazs@bristol.ac.uk

More information

Crash Course in Statistics for Neuroscience Center Zurich University of Zurich

Crash Course in Statistics for Neuroscience Center Zurich University of Zurich Crash Course in Statistics for Neuroscience Center Zurich University of Zurich Dr. C.J. Luchsinger 1 Probability Nice to have read: Chapters 1, 2 and 3 in Stahel or Chapters 1 and 2 in Cartoon Guide Further

More information

Problem 1. Problem 2. Problem 3. Problem 4

Problem 1. Problem 2. Problem 3. Problem 4 Problem Let A be the event that the fungus is present, and B the event that the staph-bacteria is present. We have P A = 4, P B = 9, P B A =. We wish to find P AB, to do this we use the multiplication

More information

STAT:5100 (22S:193) Statistical Inference I

STAT:5100 (22S:193) Statistical Inference I STAT:5100 (22S:193) Statistical Inference I Week 3 Luke Tierney University of Iowa Fall 2015 Luke Tierney (U Iowa) STAT:5100 (22S:193) Statistical Inference I Fall 2015 1 Recap Matching problem Generalized

More information

Module 8 Probability

Module 8 Probability Module 8 Probability Probability is an important part of modern mathematics and modern life, since so many things involve randomness. The ClassWiz is helpful for calculating probabilities, especially those

More information

Metric-based classifiers. Nuno Vasconcelos UCSD

Metric-based classifiers. Nuno Vasconcelos UCSD Metric-based classifiers Nuno Vasconcelos UCSD Statistical learning goal: given a function f. y f and a collection of eample data-points, learn what the function f. is. this is called training. two major

More information

ECON1310 Quantitative Economic and Business Analysis A

ECON1310 Quantitative Economic and Business Analysis A ECON1310 Quantitative Economic and Business Analysis A Topic 1 Descriptive Statistics 1 Main points - Statistics descriptive collecting/presenting data; inferential drawing conclusions from - Data types

More information

Estimation MLE-Pandemic data MLE-Financial crisis data Evaluating estimators. Estimation. September 24, STAT 151 Class 6 Slide 1

Estimation MLE-Pandemic data MLE-Financial crisis data Evaluating estimators. Estimation. September 24, STAT 151 Class 6 Slide 1 Estimation September 24, 2018 STAT 151 Class 6 Slide 1 Pandemic data Treatment outcome, X, from n = 100 patients in a pandemic: 1 = recovered and 0 = not recovered 1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0 1 1 1

More information