Producing data Toward statistical inference. Section 3.3


Toward statistical inference
Idea: use sampling to understand statistical inference. Statistical inference draws a conclusion about a population from the characteristics of a sample drawn from it.
[Figure: population and sample]

Terminology
A parameter is a number that describes a characteristic of a population. Ex: p is the proportion with some trait in the population.
A statistic is a number that describes a characteristic of a sample. Ex: p̂ is the proportion with the trait in the sample.
The observed value of a statistic is used to estimate the unobserved value of a parameter. Ex: p̂ estimates p.

Sampling variability
Sampling variability is the phenomenon by which repeating the sampling mechanism produces different samples. Suppose a statistic is recalculated for each sample under repeated sampling. The distribution of its values is its sampling distribution.

Bias of a statistic
The bias of a statistic is described by the center of its sampling distribution. A statistic is unbiased if the mean of its sampling distribution equals the parameter it is intended to estimate. Use random sampling to produce unbiased estimates.

Variability of a statistic
The variability of a statistic is described by the spread of its sampling distribution. A margin of error is determined by the variability of a statistic. The variability will be smaller if the statistic is calculated from a larger sample. Variability can be made arbitrarily small with a large enough sample (but sampling costs money, time, effort, etc.).
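The ideas of sampling distribution, bias, and variability can be explored with a small simulation. The sketch below is illustrative only: the population proportion p = 0.3, the sample sizes, and the function names are assumptions, and each sample is drawn as independent yes/no draws, which approximates a simple random sample from a very large population.

```python
import random
import statistics


def sample_proportion(p, n, rng):
    """Draw one sample of size n and return the sample proportion p-hat."""
    return sum(rng.random() < p for _ in range(n)) / n


def sampling_distribution(p, n, reps, seed=0):
    """Recompute p-hat for `reps` independent samples of size n."""
    rng = random.Random(seed)
    return [sample_proportion(p, n, rng) for _ in range(reps)]


p = 0.3  # hypothetical population proportion (not from the slides)
small = sampling_distribution(p, n=25, reps=2000)
large = sampling_distribution(p, n=400, reps=2000)

# Center near p suggests little bias; the spread shrinks as n grows.
print("n=25 :", statistics.mean(small), statistics.stdev(small))
print("n=400:", statistics.mean(large), statistics.stdev(large))
```

Both sets of values should center near p, while the n = 400 values should be much less spread out than the n = 25 values.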

Producing data Data ethics Section 3.4

Risks of data production
Ethical issues may arise in the production of data, especially when people are involved as subjects. Examples of risks to participating subjects:
  Direct risk to physical health
  Violations of personal space and privacy
  Being the target of deception

Standards of data ethics
Oversight by an institutional review board
  Charged with protecting the interests of subjects
Participation only after informed consent
  Inform subjects of the nature of the experiment and its risks
  Obtain consent in writing, if possible
Confidentiality of raw data
  Release only statistical summaries publicly

Probability and Sampling Distributions Randomness Chapter 4.1

Randomness and probability
Observations of random phenomena:
  Patterns emerge in the long run, after many repetitions of a chance-happening
  Short-term patterns are unpredictable
Probability attempts to describe the long-term patterns of random phenomena.

Long-run probabilities
A probability is the proportion of times that some interesting outcome is observed in the long run.
[Figure: first series of tosses; second series]
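The long-run behavior of two series of tosses can be imitated with a short simulation. This is a sketch under assumptions: a fair coin, 10,000 tosses per series, and arbitrary seeds. None of these values come from the slides.

```python
import random


def running_proportion(n_tosses, seed):
    """Return the proportion of heads after each toss of a fair coin."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for i in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1
        proportions.append(heads / i)
    return proportions


first = running_proportion(10_000, seed=1)   # first series of tosses
second = running_proportion(10_000, seed=2)  # second series

# Early proportions differ between series; later ones settle near 0.5.
for n in (10, 100, 1000, 10_000):
    print(n, first[n - 1], second[n - 1])
```

The two series typically disagree after a handful of tosses and then both drift toward 0.5 as the number of tosses grows.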

Probability and Sampling Distributions Probability models Chapter 4.2

Probability models
A probability model is a mathematical framework for describing random phenomena: an assignment of probabilities to a set of outcomes. An outcome is a possible value generated by the chance-happening of interest. Probability rules are the mathematical laws required for a probability model to make sense.

Basic setup of a probability model
The sample space, S, is the set of all possible outcomes.
  Represents a single repetition of a chance-happening
An event, A, B, C, etc., is a subset of the sample space.
  Represents the occurrence of a certain interesting thing

Relationships between events
The complement, A^c, of an event A is the set of outcomes that are not in A.
  Represents the nonoccurrence of a certain interesting thing
Events A and B are disjoint if they share no outcomes.
  Represent things that cannot occur simultaneously
[Figure: disjoint events; events that are not disjoint]

Probability rules
0 ≤ P(A) ≤ 1
P(S) = 1
Complement rule: P(A^c) = 1 - P(A)
Addition rule for disjoint events: if A and B are disjoint, then P(A or B) = P(A) + P(B)

Finite probabilities
Probability rules simplify when there are a finite number of possible outcomes.
  Each probability is a number between zero and one.
  The sum of all probabilities is one.
  The probability of an event is the sum of the probabilities of the outcomes comprising that event.

Example: equally likely outcomes
A couple wants to have three children. Observe the possible sequences of boys (B) and girls (G).
S = { BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG }
Assign equal probability of 1/8 to each outcome.
A = exactly two girls = { BGG, GBG, GGB }
P(A) = P(BGG) + P(GBG) + P(GGB) = 1/8 + 1/8 + 1/8 = 3/8
[Figure: tree diagram of the eight birth sequences]
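The three-children example can be checked by enumerating the sample space. The sketch below is one possible way to do it; the helper name `probability` is an assumption, and exact fractions are used so the result matches 3/8 without rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: all sequences of B/G for three children.
sample_space = ["".join(seq) for seq in product("BG", repeat=3)]

# Equally likely outcomes: each gets probability 1/8.
prob = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}


def probability(event):
    """Sum the probabilities of the outcomes that make up the event."""
    return sum(prob[outcome] for outcome in event)


exactly_two_girls = {o for o in sample_space if o.count("G") == 2}
p_A = probability(exactly_two_girls)

# Complement rule: P(A^c) = 1 - P(A)
p_not_A = probability(set(sample_space) - exactly_two_girls)

print(sorted(exactly_two_girls), p_A, p_not_A)
```

The same `probability` helper works for any event expressed as a set of outcomes, which mirrors the finite-probability rule that an event's probability is the sum over its outcomes.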

Example: Benford's law
Empirical probabilities of first digits in financial documents:
1st digit    1      2      3      4      5      6      7      8      9
Probability  0.301  0.176  0.125  0.097  0.079  0.067  0.058  0.051  0.046
[Figure: probability histogram of the outcomes 1 to 9]
P(1st digit ≥ 6) = 0.067 + 0.058 + 0.051 + 0.046 = 0.222
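The Benford table can be checked against the finite-probability rules with a few lines of code. The table values below are the ones listed in the example; the comparison with log10(1 + 1/d) uses the usual closed form of Benford's law, which the slides do not state, so treat that part as an added assumption.

```python
import math

# First-digit probabilities as listed in the example.
benford = {
    1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
    6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046,
}

# Rule check: the probabilities of all outcomes sum to one.
total = sum(benford.values())

# Event "first digit is 6 or more": add the outcome probabilities.
p_ge_6 = sum(p for digit, p in benford.items() if digit >= 6)

# Assumption (not in the slides): closed form log10(1 + 1/d).
formula = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

print(round(total, 3), round(p_ge_6, 3))
for d in range(1, 10):
    print(d, benford[d], round(formula[d], 3))
```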

Example: Two die rolls
Thirty-six possible die rolls, with equal probabilities:
P(sum is 5) = 4/36 = 0.111
P(doubles) = 6/36 = 0.167, etc.
Note: X = sum is an example of a random variable (more later).
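The two-dice probabilities follow from counting outcomes among the 36 equally likely rolls. A minimal sketch, with the helper name `probability` chosen here for illustration:

```python
from fractions import Fraction
from itertools import product

# All 36 ordered pairs (first die, second die).
rolls = list(product(range(1, 7), repeat=2))


def probability(event):
    """Fraction of equally likely rolls for which event(roll) is true."""
    favorable = sum(1 for roll in rolls if event(roll))
    return Fraction(favorable, len(rolls))


p_sum_5 = probability(lambda roll: roll[0] + roll[1] == 5)
p_doubles = probability(lambda roll: roll[0] == roll[1])

print(p_sum_5, float(p_sum_5))      # 4/36 reduces to 1/9
print(p_doubles, float(p_doubles))  # 6/36 reduces to 1/6
```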

Probabilities of intervals
If S is a continuum of values, then probabilities are assigned using a density curve.
  No part of a density curve can be negative.
  The total area under the curve must be one.
The probability P(A) of an event A = { a ≤ X ≤ b }, where X is a random variable, is the area under the curve between a and b.

Example: Uniform density curve
Probabilities of a random number generator, S = { numbers between 0 and 1 }
P(0.3 ≤ X ≤ 0.7) = 0.7 - 0.3 = 0.4

Example: General uniform density curve
Probabilities of a custom random number generator, S = { numbers between c_1 and c_2 }
P(a ≤ X ≤ b) = (b - a) / (c_2 - c_1), for c_1 ≤ a ≤ b ≤ c_2
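The uniform interval probability is simple enough to write as a function. The sketch below is an illustration, not part of the slides: the function name is hypothetical, and clipping the interval to the sample space is an added convenience so that values of a or b outside the range do not produce probabilities above one.

```python
def uniform_interval_probability(a, b, c1=0.0, c2=1.0):
    """P(a <= X <= b) for X uniform on [c1, c2].

    The interval [a, b] is clipped to [c1, c2] first (an added assumption),
    then the area of the rectangle is (width) * (height 1 / (c2 - c1)).
    """
    if not c1 < c2:
        raise ValueError("need c1 < c2")
    lo = max(a, c1)
    hi = min(b, c2)
    width = max(hi - lo, 0.0)
    return width / (c2 - c1)


print(uniform_interval_probability(0.3, 0.7))         # standard generator on [0, 1]
print(uniform_interval_probability(2, 5, c1=0, c2=10))  # hypothetical custom range
```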

Example: Sum of two random numbers
Sum of two numbers from a random number generator, S = { numbers between 0 and 2 }
The region X > 1.3 under the density curve is a triangle with base b = 2 - 1.3 and height h = 2 - 1.3, so
P(X > 1.3) = ½ b h = ½ (2 - 1.3)(2 - 1.3) = 0.245
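The triangle-area calculation can be compared against a simulation. This sketch assumes the two random numbers are independent and uniform on [0, 1]; the function names, the number of repetitions, and the seed are illustrative choices.

```python
import random


def exact_upper_tail(x):
    """P(X > x) for the sum of two independent uniform(0, 1) numbers.

    Valid for 1 <= x <= 2, where the region is a right triangle
    with base (2 - x) and height (2 - x).
    """
    if not 1 <= x <= 2:
        raise ValueError("this triangle formula only covers 1 <= x <= 2")
    side = 2 - x
    return 0.5 * side * side


def simulated_upper_tail(x, reps=200_000, seed=0):
    """Estimate P(X > x) as a long-run proportion."""
    rng = random.Random(seed)
    hits = sum(rng.random() + rng.random() > x for _ in range(reps))
    return hits / reps


exact = exact_upper_tail(1.3)
estimate = simulated_upper_tail(1.3)
print(exact, estimate)
```

The simulated proportion should land close to the exact area, which ties the density-curve picture back to the long-run interpretation of probability.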

Example: Normal curves
X = ACT college entrance exam scores
Suppose X is N(µ = 18.6, σ = 5.9).
Probability interpretation: X is the score of a randomly selected student.
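Areas under a normal curve can be computed with Python's standard library. In the sketch below, the mean and standard deviation come from the example, while the cutoff score of 21 is a hypothetical value chosen only to show the calculation.

```python
from statistics import NormalDist

# Normal model for ACT scores from the example.
act = NormalDist(mu=18.6, sigma=5.9)

cutoff = 21  # hypothetical score, not from the slides

# P(X >= cutoff): area under the curve to the right of the cutoff.
p_at_least_cutoff = 1 - act.cdf(cutoff)

# P(mu - sigma <= X <= mu + sigma): area within one standard deviation.
p_within_one_sd = act.cdf(18.6 + 5.9) - act.cdf(18.6 - 5.9)

print(p_at_least_cutoff)
print(p_within_one_sd)
```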

Probability and Sampling Distributions Random variables Chapter 4.3

Random variables
A random variable, X, is an idealization of quantitative data recorded from many repetitions of a chance-happening.
Example: A couple wants to have three children.
X = # girls
S = { 0, 1, 2, 3 }
[Figure: tree diagram of the eight birth sequences BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG]

Probability distribution of a random variable
The probability distribution of a random variable is its assignment of probabilities in an underlying probability model.
Example: X = # girls among three children
P(X = 0) = P(BBB) = 1/8
P(X = 1) = P(BBG) + P(BGB) + P(GBB) = 3/8
P(X = 2) = P(BGG) + P(GBG) + P(GGB) = 3/8
P(X = 3) = P(GGG) = 1/8
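The distribution of X = # girls can be derived mechanically from the underlying model of eight equally likely birth sequences. A minimal sketch, with variable names chosen for illustration:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Underlying probability model: eight equally likely sequences.
outcomes = ["".join(seq) for seq in product("BG", repeat=3)]

# X maps each outcome to its number of girls; count outcomes per value of X.
counts = Counter(outcome.count("G") for outcome in outcomes)

# P(X = x) = (number of outcomes with x girls) / 8
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

Note that the only outcome with X = 3 is GGG, and the four probabilities add up to one.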

Types of random variables
A discrete random variable represents data whose possible values can be counted.
  Its probability distribution is given as a table of probabilities.
A continuous random variable represents data whose values lie on a continuum.
  Its probability distribution is given as a density curve.

The continuous probability of an exact value
Suppose X is a continuous random variable:
  Natural probability questions involve events A = { a ≤ X ≤ b }.
  Events A = { X = a } have zero probability: P(X = a) = 0.
  P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b)
Caution: For discrete random variables, the choice between < and ≤ matters critically.
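The caution about discrete random variables can be seen with the # girls distribution from the three-children example. The comparison at the value 2 is an illustrative choice, not one made in the slides.

```python
from fractions import Fraction

# Distribution of X = # girls among three children.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# For a discrete variable, including or excluding the endpoint changes the answer.
p_less_than_2 = sum(p for x, p in pmf.items() if x < 2)
p_at_most_2 = sum(p for x, p in pmf.items() if x <= 2)

print(p_less_than_2)  # excludes X = 2
print(p_at_most_2)    # includes X = 2
print(p_at_most_2 - p_less_than_2)  # the gap is exactly P(X = 2)
```

For a continuous variable the gap would be P(X = 2) = 0, which is why the four interval probabilities coincide in that case.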