Data Analysis and Statistical Methods Statistics 651

Size: px
Start display at page:

Download "Data Analysis and Statistical Methods Statistics 651"

Transcription

1 Data Analysis and Statistical Methods Statistics (Chapter 2) Lecture 5 (MWF) Probabilities and the rules Suhasini Subba Rao

2 Review of previous lecture We looked at the interquartile range. The concept of a variance and standard deviation was introduced. The standard deviation is the square root of the variance. Like the mean, the standard deviation (which is the square root of the variance) is a measure of spread of the population, and is not random. On the other hand, the sample variance and standard deviation are random. In statistics we have: Populations, Random variables (the measurements we are interested in the population) and Probabilities which are allocated 1

3 to the random variables. In this lecture we will introduce the notion of a probability and the rules that go with it. 2

4 Probability Suppose there is a population of people and I select one person at random. What is the chance (probability) that the selected person s height lies in the interval [5.5, 5.75]feet? How do we calculate this chance? The proportion of heights that lie in [5.5, 5.75] is: Total number of people in population with height in the interval [5.5, 5.75]. Total number of people in population 3

5 Summary: Probabilities and random variables A probability is always positive and is greater than or equal to zero and less than or equal to one (lies in the interval [0, 1]). Random variables and probabilities A random variable (often denoted X) is a variable whose outcomes are random (eg. the height of a randomly selected student or the gender of a random selected person) A random variable can take any one of several different outcomes (this is why it is random), to each outcome we allocate a probability. We now consider some examples to familiarize ourselves with this notation. 4

6 Examples of random variables and their probabilities Y is the (assume for simplicity binary) gender of a randomly chosen in the world population. Then Y = {Male or Female}. We write P (Y =Female) to denote the probability a random chosen person is female (equivalently the proportion of females). X is the height of a randomly chosen person. P (5 < X < 5.25), denotes the probability of that a randomly selected students height is in the interval [5, 5.25] (equivalently the proportion of people between feet). 5

7 Illustration:Data from a 651 class Summary of information from a 651 class. 6

8 Probabilities and random variables: Illustration 1 (i) The measurement of interest may be the height of the randomly chosen person. Therefore X = height of randomly selected person. The set of all possible outcomes is any number in the interval [5, 7]. Therefore, P (5 X 7) = 1. P (5.75 X < 6.25) = P (5.75 X < 6) + P (6 X < 6.25) = It is impossible from this plot to evaluate P (5.9 X < 6.15). (ii) The measurement of interest may be the gender of the randomly chosen person. Therefore Y = gender of randomly chosen person. The set of all possible outcomes is {male, female}. We see that P (Y = Female) = and P (Y = Male) =

9 We recall that data set: Lecture 5 (MWF) Random variables, Probability and conditional probabilities Illustration 2: Binge drinking Male Female Binge Not Binge Subtotals P (Binge) = 3314/17096 = P (Not Binge) = 13782/17076 =

10 Mutually exclusive events Two events (outcomes) are mutually exclusive, if the occurrence of one event excludes the occurrence of the other event. In the height example, we define the events A = [5, 5.5) is an (this is the event that a randomly chosen person s height lies in the interval A = [5, 5.5)) and B = [5.75, 6). A = [5, 5.5) and B = [5.75, 6) are mutually exclusive events. This means if person s height is in A = [5, 5.5), then their height cannot be in B = [5.75, 6]. Similarly, if X is in B = [5.75, 6) then it cannot be in A = [5, 5.5). On the other hand, the events A = [5, 5.5) and C = [5.25, 5.75) are not mutually exclusive. If X = 5.3, then it is in both A and C. 9

11 Let X denote the time of the day a person wakes up. Let A=night time and B=day time, then A and B are mutually exclusive events. In any given day one the person cannot get up at both night and day time. 10

12 Mutually exclusive events and probabilities Calculating probabilities If two events A and B are mutually exclusive, then P (A or B) = P (A) + P (B) (the probability of either A or B arising). Binge Drinking example In the survey above students were asked if they had every drunk in excess. The answer is binary, yes or no. Yes and No are mutually exclusive events. Since P (Yes or No) = P (Yes) + P (No) = 1. We saw from above that P (Binge)= P (Yes) = 3314/17096 = This means P (No) = =

13 Height Example Suppose X is the height of a randomly chosen person. If A = [5, 5.5) and B = [5.75, 6), then P (X in A or X in B) = P (5 X < 5.5 or 5.75 X < 6). Since A and B are mutually exclusive then P (X in A or X in B) = P (5 X < 5.5 or 5.75 X < 6) = P (5 X < 5.5) + P (5.75 X < 6). Plot events A and B on a density plot. 12

14 Example 2 Let X denote the height of a person. Suppose that the probability P (X t) is known for all t. We know that students heights lie within 5 7 feet. Using this information, how can we calculate the probabilities below using P (X t) for any t: (i) P (5.5 X < 6.5). (ii) P (X > 6). (iii) P (5 X < 6 or 6.5 X < 7). (iv) P (5 X 6 or 5.75 X < 7). (v) P (5 X < 6 and 5.75 X < 7). We use this formulation to calculate probabilites using the normal/z tables. 13

15 Solution 2 (i) P (5.5 < X 6.5). 14

16 (ii) P (X > 6) = 1 P (X 6). Lecture 5 (MWF) Random variables, Probability and conditional probabilities 15

17 (iii) P (5 < X 6 or 6.5 < X 7). Lecture 5 (MWF) Random variables, Probability and conditional probabilities 16

18 (iv) P (5 < X 6 or 5.5 < X 7). Lecture 5 (MWF) Random variables, Probability and conditional probabilities 17

19 (v) P (5 < X 6 and 5.5 < X 7). Lecture 5 (MWF) Random variables, Probability and conditional probabilities 18

20 Motivation: Conditional probabilities Is there a relationship between binge drinking (excessive consumption of alcohol) and gender students were surveyed (7180 males and 9916 females). The data is given below. Male Female Binge Not Binge Subtotals In order to analyze this type of data we need to introduce the notion of a conditional probability. This is a probability calculated using not the entire population but a subpopulation. 19

21 Motivation: cont A conditional probability is the probability the probability of an event given that we already have some (possibly partial) information about. Male Female Binge Not Binge Subtotals Proportion of females who were bing drinking is 1684/ %, whereas proportion of males who were binge drinking is 1630/ %. We can write these conditional probabilities as P(binge drink Male) = and P(binge drink Female) = Based on the above information, does additional information about gender change the chance of binge drinking? 20

22 This example alludes to the notion of independence, that we define at the end of this lecture. 21

23 Stents In a randomized trial 451 at risk stroke patients were randomly placed in a stents (medical devise) group and a control group. The data after one year is summarized below. Stroke No event treatment control Subtotals P (stroke stent) = 45/224 = 0.2 P (stroke control) = 28/227 = 0.12 What does the difference suggest. Why should we be careful about drawing conclusions about the population of at risk stroke patients based on just the difference of 0.2 and 0.12? 22

24 Conditional probability: Heights and gender of 18 people Height Gender F M F M M F F M F Height Gender M M F M F M M F F We randomly select a person. Let X denote their height and Y their gender. Calculate (i) P (X 5.5). (ii) P (X 5.5 Y = M). (iii) P (X 5.5 Y = F ). What do we observe? Is there a difference in the distribution of male and female heights. 23

25 Conditional probability Using contingency tables to calculate conditional probabilities is straighforward. However, sometimes the variable on which we condition is not always categorical. In this case we use the formula: P (A B) = P (A and B). P (B) The same formula can also be used for contingency tables. Keep in mind that P (A and B) means that an observation has to be in both sets A and B. For example P (5 X < 6 and 5.75 X < 7) = P (5.75 X < 6). 24

26 Whereas P (A or B) means that an observation can satisfy either one of the conditions. Example 1 (height) Suppose X is the height of a randomly selected person and we know that X lies in the interval [5, 6], then what is the probability X lies in [5, 5.5]? We write this probability as: Number of people whose height is in [5, 5.5] Number of people whose height is in [5, 6] = P (5 X X 6) Answer: Using the formula we calculate the probability. Let A = 5 X 5.5 and B = 5 X 6. Using 25

27 We have P (A B) = P (A and B) P (B) = P (A) P (B) 26

28 P (5 X X 6) = = = We see that P (5 X X 6) > P (5 X 5.5). In this example, additional information about the height increased the chance. 27

29 Additional information will not always increase the chance. For example, P (X > 6 5 X 6) = 0 < P (5 X 5.5). Example 2 (height) Calculate the the probability X is between 5 and 6 feet given that we know that person s height is between feet (P (5 < X 6 5 < X 5.5))? Answer Using the formula we calculate the probability. Let A = 5 X 6 and B = 5 X 5.5. Using We have P (5 < X 6 5 < X 5.5) = P (A B) = P (B) P (B) = 1. 28

30 Independence and conditional probabilities Definition Suppose that we have two events A and B. The events A and B are independent of each other if P (A B) = P (A). This means the event B has no influence what so ever on the chance of event A occurring. Let us return to the example of heights and gender. Are height and gender independent variables? In other words, does information about the gender of a person give additional information about their height. Intuitively this seems to be the case. We showed in a previous example: P (5 X 5.5 Y = female) > P (5 X 5.5 Y = male). Which means that P (5 X 5.5 Y = female) P (5 X 5.5) This implies that gender and height are not independent variables. 29

31 Random variables X and Y are independent if for any outcome of X and and outcome of Y, P (X = event A Y = event B) = P (X = event A). Mutually exclusive events are dependent! Example Let A denote the event a height is in the interval [5, 5.5] and B denote the event a height is in the interval [6, 6.5]. We define the corresponding binary variables, Y =Yes if A is true and Y =No if A is not true the variable Z=Yes if B is true and Z=No if B is not true). Then P (Y = True Z=True) = P (X in [5, 5.5] X in [6, 6.5]) = 0 P (X in [5, 5.5]) = P (Y = True). A and B are not independent events. In other words, information about Z gives us additional information about Y. 30

32 Application: Binge drinking Male Female Binge Not Binge Subtotals The proportion of binge drinking in females = P(binge drink Female) = The proportion of binge drinking in males = P(binge drink Male) = By definition there is a dependence/association between between gender and binge drinking (at least in this sample). However, we must be careful. We cannot determine a causality. This is a observational study and other factors (associated with the male participants) such as fraternities, may be driving the increase. 31

33 Application: Stents Stroke No event treatment control Subtotals Proportion of treatment group who got a stroke P (stroke treatment) = 45/224 = 0.2. Proportion in control group who got a stroke P (stroke control) = 28/227 = For this example, there is a dependence between treatment and stroke. If we can determine that this difference is statistically significant (cannot be explained by sample variation), then since the study is experiment with the only difference being the treatment it suggests that the use of stents in increasing the risk of a stroke (a causality can be determined). 32

34 Examples: Independent events Define the random variables. X is the season in college station. Y are the number of hair dressers in college station. Z is the temperature in college station. It is unlikely that the number of hair dressers in college station have an influence on the temperature. Therefore P (Z [25, 28] Y ) = P (X [25, 28]). 33

35 On the other hand season does have an unfluence on temperature, therefore P (X [25, 28] X = summer ) P (X [25, 28]). 34

36 Dependence does not imply causality A study was done in the early 90s to see if the mortality rates between left and right handed people were the same. To do this the psychologists collected the death records of 2000 people who died in May, 1990, in Southern California. They rang the families of all these people and asked whether the were left or right handed. They also categorized the people who died into those above 60 and below 60 years old. This is the data they collected. Out of the 2000, 400 were left handed. Out of the 2000, 150 were left handed and died below 60. Out of the 2000, 300 were right handed and died below

37 (a) A person dies below 60? Lecture 5 (MWF) Random variables, Probability and conditional probabilities (b) A person dies below 60 given that they are left handed? (c) Does the data suggest there is an association/dependence between left handedess and early mortality? 36

38 Solution It instructive to summarize the data as a contingency table: Died before 60 Died after 60 totals Left Right Totals (a) The chance a person dies below 60. Calculate the total proportion who died before 60 P (before 60) = (b) A person dies below 60 given that they are left handed. 37

39 Focus on people who are only left handed. There are 400 of these. Of these, 150 died before 60. Therefore the proportion of left handed people in the sample who died before 60 is P (before 60 left handed) = = 3 8 = 37.5% Using the same argument the proportion of right handed people in the sample who died before 60 is P (before 60 right handed) = = 3 16 = 18.75%. (c) As these proportions are different, it suggests there is some sort of dependence/association between lefthandedness and early mortality. 38

40 Why the dependence? Can we conclude from this data that being left handed increases the chance of early mortality? This statement is a causal statement, it asks whether being left handed causes early mortality. This data set cannot answer this question, does not take into account other variables associated with being left handed which are causing the effect you are seeing. The most plausible explanation for the difference in proportions is that the incidence of left-handedness has increased over the past 50 years (because people are no longer forced to be right handed). The unobserved variable is 39

41 Example: Before 1930, no one was left handed, what would the numbers in the table and proportions be? 40

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 5 (MWF) Probabilities and the rules Suhasini Subba Rao Review of previous lecture We looked

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 6 (MWF) Conditional probabilities and associations Suhasini Subba Rao Review of previous lecture

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 y 1 2 3 4 5 6 7 x Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 32 Suhasini Subba Rao Previous lecture We are interested in whether a dependent

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching/ Suhasini Subba Rao Review In the previous lecture we looked at the statistics of M&Ms. This example illustrates

More information

Sampling. Module II Chapter 3

Sampling. Module II Chapter 3 Sampling Module II Chapter 3 Topics Introduction Terms in Sampling Techniques of Sampling Essentials of Good Sampling Introduction In research terms a sample is a group of people, objects, or items that

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 31 (MWF) Review of test for independence and starting with linear regression Suhasini Subba

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 26 (MWF) Tests and CI based on two proportions Suhasini Subba Rao Comparing proportions in

More information

Two-sample Categorical data: Testing

Two-sample Categorical data: Testing Two-sample Categorical data: Testing Patrick Breheny April 1 Patrick Breheny Introduction to Biostatistics (171:161) 1/28 Separate vs. paired samples Despite the fact that paired samples usually offer

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 9 (MWF) Calculations for the normal distribution Suhasini Subba Rao Evaluating probabilities

More information

Conditional Probability Solutions STAT-UB.0103 Statistics for Business Control and Regression Models

Conditional Probability Solutions STAT-UB.0103 Statistics for Business Control and Regression Models Conditional Probability Solutions STAT-UB.0103 Statistics for Business Control and Regression Models Counting (Review) 1. There are 10 people in a club. How many ways are there to choose the following:

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 26 (MWF) Tests and CI based on two proportions Suhasini Subba Rao Comparing proportions in

More information

Chapter 5 : Probability. Exercise Sheet. SHilal. 1 P a g e

Chapter 5 : Probability. Exercise Sheet. SHilal. 1 P a g e 1 P a g e experiment ( observing / measuring ) outcomes = results sample space = set of all outcomes events = subset of outcomes If we collect all outcomes we are forming a sample space If we collect some

More information

ECON Semester 1 PASS Mock Mid-Semester Exam ANSWERS

ECON Semester 1 PASS Mock Mid-Semester Exam ANSWERS ECON1310 2006 Semester 1 PASS Mock Mid-Semester Exam ANSWERS MULTIPLE CHOICE QUESTIONS 1. Unemployment rates are an example of: a. Cross-sectional, quantitative, continuous data b. Time-series, quantitative,

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Boxplots and standard deviations Suhasini Subba Rao Review of previous lecture In the previous lecture

More information

Lecture 1: Description of Data. Readings: Sections 1.2,

Lecture 1: Description of Data. Readings: Sections 1.2, Lecture 1: Description of Data Readings: Sections 1.,.1-.3 1 Variable Example 1 a. Write two complete and grammatically correct sentences, explaining your primary reason for taking this course and then

More information

Psych 230. Psychological Measurement and Statistics

Psych 230. Psychological Measurement and Statistics Psych 230 Psychological Measurement and Statistics Pedro Wolf December 9, 2009 This Time. Non-Parametric statistics Chi-Square test One-way Two-way Statistical Testing 1. Decide which test to use 2. State

More information

DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 2003 MOCK EXAMINATIONS IOP 201-Q (INDUSTRIAL PSYCHOLOGICAL RESEARCH)

DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 2003 MOCK EXAMINATIONS IOP 201-Q (INDUSTRIAL PSYCHOLOGICAL RESEARCH) DE CHAZAL DU MEE BUSINESS SCHOOL AUGUST 003 MOCK EXAMINATIONS IOP 01-Q (INDUSTRIAL PSYCHOLOGICAL RESEARCH) Time: hours READ THE INSTRUCTIONS BELOW VERY CAREFULLY. Do not open this question paper until

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 65 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Comparing populations Suppose I want to compare the heights of males and females

More information

( ) P A B : Probability of A given B. Probability that A happens

( ) P A B : Probability of A given B. Probability that A happens A B A or B One or the other or both occurs At least one of A or B occurs Probability Review A B A and B Both A and B occur ( ) P A B : Probability of A given B. Probability that A happens given that B

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Review Our objective: to make confident statements about a parameter (aspect) in

More information

Topic -2. Probability. Larson & Farber, Elementary Statistics: Picturing the World, 3e 1

Topic -2. Probability. Larson & Farber, Elementary Statistics: Picturing the World, 3e 1 Topic -2 Probability Larson & Farber, Elementary Statistics: Picturing the World, 3e 1 Probability Experiments Experiment : An experiment is an act that can be repeated under given condition. Rolling a

More information

Topic 4 Probability. Terminology. Sample Space and Event

Topic 4 Probability. Terminology. Sample Space and Event Topic 4 Probability The Sample Space is the collection of all possible outcomes Experimental outcome An outcome from a sample space with one characteristic Event May involve two or more outcomes simultaneously

More information

Binomial and Poisson Probability Distributions

Binomial and Poisson Probability Distributions Binomial and Poisson Probability Distributions Esra Akdeniz March 3, 2016 Bernoulli Random Variable Any random variable whose only possible values are 0 or 1 is called a Bernoulli random variable. What

More information

FCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E

FCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E FCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E OBJECTIVE COURSE Understand the concept of population and sampling in the research. Identify the type

More information

Probability 5-4 The Multiplication Rules and Conditional Probability

Probability 5-4 The Multiplication Rules and Conditional Probability Outline Lecture 8 5-1 Introduction 5-2 Sample Spaces and 5-3 The Addition Rules for 5-4 The Multiplication Rules and Conditional 5-11 Introduction 5-11 Introduction as a general concept can be defined

More information

Probability and Discrete Distributions

Probability and Discrete Distributions AMS 7L LAB #3 Fall, 2007 Objectives: Probability and Discrete Distributions 1. To explore relative frequency and the Law of Large Numbers 2. To practice the basic rules of probability 3. To work with the

More information

Probability deals with modeling of random phenomena (phenomena or experiments whose outcomes may vary)

Probability deals with modeling of random phenomena (phenomena or experiments whose outcomes may vary) Chapter 14 From Randomness to Probability How to measure a likelihood of an event? How likely is it to answer correctly one out of two true-false questions on a quiz? Is it more, less, or equally likely

More information

Topic 2 Probability. Basic probability Conditional probability and independence Bayes rule Basic reliability

Topic 2 Probability. Basic probability Conditional probability and independence Bayes rule Basic reliability Topic 2 Probability Basic probability Conditional probability and independence Bayes rule Basic reliability Random process: a process whose outcome can not be predicted with certainty Examples: rolling

More information

Chapter. Probability

Chapter. Probability Chapter 3 Probability Section 3.1 Basic Concepts of Probability Section 3.1 Objectives Identify the sample space of a probability experiment Identify simple events Use the Fundamental Counting Principle

More information

Probability (special topic)

Probability (special topic) Chapter 2 Probability (special topic) Probability forms a foundation for statistics. You may already be familiar with many aspects of probability, however, formalization of the concepts is new for most.

More information

CHAPTER 10 Comparing Two Populations or Groups

CHAPTER 10 Comparing Two Populations or Groups CHAPTER 10 Comparing Two Populations or Groups 10.1 Comparing Two Proportions The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Comparing Two Proportions

More information

Nicole Dalzell. July 2, 2014

Nicole Dalzell. July 2, 2014 UNIT 1: INTRODUCTION TO DATA LECTURE 3: EDA (CONT.) AND INTRODUCTION TO STATISTICAL INFERENCE VIA SIMULATION STATISTICS 101 Nicole Dalzell July 2, 2014 Teams and Announcements Team1 = Houdan Sai Cui Huanqi

More information

3/30/2009. Probability Distributions. Binomial distribution. TI-83 Binomial Probability

3/30/2009. Probability Distributions. Binomial distribution. TI-83 Binomial Probability Random variable The outcome of each procedure is determined by chance. Probability Distributions Normal Probability Distribution N Chapter 6 Discrete Random variables takes on a countable number of values

More information

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure).

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). STAT 515 -- Chapter 13: Categorical Data Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). Many studies allow for more than 2 categories. Example

More information

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability? Probability: Why do we care? Lecture 2: Probability and Distributions Sandy Eckel seckel@jhsph.edu 22 April 2008 Probability helps us by: Allowing us to translate scientific questions into mathematical

More information

COMPLEMENTARY EXERCISES WITH DESCRIPTIVE STATISTICS

COMPLEMENTARY EXERCISES WITH DESCRIPTIVE STATISTICS COMPLEMENTARY EXERCISES WITH DESCRIPTIVE STATISTICS EX 1 Given the following series of data on Gender and Height for 8 patients, fill in two frequency tables one for each Variable, according to the model

More information

Homework (due Wed, Oct 27) Chapter 7: #17, 27, 28 Announcements: Midterm exams keys on web. (For a few hours the answer to MC#1 was incorrect on

Homework (due Wed, Oct 27) Chapter 7: #17, 27, 28 Announcements: Midterm exams keys on web. (For a few hours the answer to MC#1 was incorrect on Homework (due Wed, Oct 27) Chapter 7: #17, 27, 28 Announcements: Midterm exams keys on web. (For a few hours the answer to MC#1 was incorrect on Version A.) No grade disputes now. Will have a chance to

More information

Lecture 9. Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests

Lecture 9. Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests Lecture 9 Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests Univariate categorical data Univariate categorical data are best summarized in a one way frequency table.

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics Math 140 Introductory Statistics Professor Silvia Fernández Lecture 8 Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. 5.1 Models of Random Behavior Outcome: Result or answer

More information

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart ST2001 2. Presenting & Summarising Data Descriptive Statistics Frequency Distribution, Histogram & Bar Chart Summary of Previous Lecture u A study often involves taking a sample from a population that

More information

Lecture 1 Introduction to Multi-level Models

Lecture 1 Introduction to Multi-level Models Lecture 1 Introduction to Multi-level Models Course Website: http://www.biostat.jhsph.edu/~ejohnson/multilevel.htm All lecture materials extracted and further developed from the Multilevel Model course

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Review of previous lecture We showed if S n were a binomial random variable, where

More information

Potential Outcomes Model (POM)

Potential Outcomes Model (POM) Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics

More information

Exercise 1. Exercise 2. Lesson 2 Theoretical Foundations Probabilities Solutions You ip a coin three times.

Exercise 1. Exercise 2. Lesson 2 Theoretical Foundations Probabilities Solutions You ip a coin three times. Lesson 2 Theoretical Foundations Probabilities Solutions monia.ranalli@uniroma3.it Exercise 1 You ip a coin three times. 1. Use a tree diagram to show the possible outcome patterns. How many outcomes are

More information

Resistant Measure - A statistic that is not affected very much by extreme observations.

Resistant Measure - A statistic that is not affected very much by extreme observations. Chapter 1.3 Lecture Notes & Examples Section 1.3 Describing Quantitative Data with Numbers (pp. 50-74) 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar)

More information

Lesson 19: Understanding Variability When Estimating a Population Proportion

Lesson 19: Understanding Variability When Estimating a Population Proportion Lesson 19: Understanding Variability When Estimating a Population Proportion Student Outcomes Students understand the term sampling variability in the context of estimating a population proportion. Students

More information

P (A B) P ((B C) A) P (B A) = P (B A) + P (C A) P (A) = P (B A) + P (C A) = Q(A) + Q(B).

P (A B) P ((B C) A) P (B A) = P (B A) + P (C A) P (A) = P (B A) + P (C A) = Q(A) + Q(B). Lectures 7-8 jacques@ucsdedu 41 Conditional Probability Let (Ω, F, P ) be a probability space Suppose that we have prior information which leads us to conclude that an event A F occurs Based on this information,

More information

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected What is statistics? Statistics is the science of: Collecting information Organizing and summarizing the information collected Analyzing the information collected in order to draw conclusions Two types

More information

P (E) = P (A 1 )P (A 2 )... P (A n ).

P (E) = P (A 1 )P (A 2 )... P (A n ). Lecture 9: Conditional probability II: breaking complex events into smaller events, methods to solve probability problems, Bayes rule, law of total probability, Bayes theorem Discrete Structures II (Summer

More information

Lecture 2: Probability and Distributions

Lecture 2: Probability and Distributions Lecture 2: Probability and Distributions Ani Manichaikul amanicha@jhsph.edu 17 April 2007 1 / 65 Probability: Why do we care? Probability helps us by: Allowing us to translate scientific questions info

More information

Sets and Set notation. Algebra 2 Unit 8 Notes

Sets and Set notation. Algebra 2 Unit 8 Notes Sets and Set notation Section 11-2 Probability Experimental Probability experimental probability of an event: Theoretical Probability number of time the event occurs P(event) = number of trials Sample

More information

Teaching Research Methods: Resources for HE Social Sciences Practitioners. Sampling

Teaching Research Methods: Resources for HE Social Sciences Practitioners. Sampling Sampling Session Objectives By the end of the session you will be able to: Explain what sampling means in research List the different sampling methods available Have had an introduction to confidence levels

More information

Statistics for Business and Economics

Statistics for Business and Economics Statistics for Business and Economics Basic Probability Learning Objectives In this lecture(s), you learn: Basic probability concepts Conditional probability To use Bayes Theorem to revise probabilities

More information

Contingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878

Contingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878 Contingency Tables I. Definition & Examples. A) Contingency tables are tables where we are looking at two (or more - but we won t cover three or more way tables, it s way too complicated) factors, each

More information

Math 10 - Compilation of Sample Exam Questions + Answers

Math 10 - Compilation of Sample Exam Questions + Answers Math 10 - Compilation of Sample Exam Questions + Sample Exam Question 1 We have a population of size N. Let p be the independent probability of a person in the population developing a disease. Answer the

More information

CSE 103 Homework 8: Solutions November 30, var(x) = np(1 p) = P r( X ) 0.95 P r( X ) 0.

CSE 103 Homework 8: Solutions November 30, var(x) = np(1 p) = P r( X ) 0.95 P r( X ) 0. () () a. X is a binomial distribution with n = 000, p = /6 b. The expected value, variance, and standard deviation of X is: E(X) = np = 000 = 000 6 var(x) = np( p) = 000 5 6 666 stdev(x) = np( p) = 000

More information

Two-sample Categorical data: Testing

Two-sample Categorical data: Testing Two-sample Categorical data: Testing Patrick Breheny October 29 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/22 Lister s experiment Introduction In the 1860s, Joseph Lister conducted a landmark

More information

2 Chapter 2: Conditional Probability

2 Chapter 2: Conditional Probability STAT 421 Lecture Notes 18 2 Chapter 2: Conditional Probability Consider a sample space S and two events A and B. For example, suppose that the equally likely sample space is S = {0, 1, 2,..., 99} and A

More information

Sections 3.4 and 3.5

Sections 3.4 and 3.5 Sections 3.4 and 3.5 Timothy Hanson Department of Statistics, University of South Carolina Stat 205: Elementary Statistics for the Biological and Life Sciences 1 / 20 Continuous variables So far we ve

More information

Lecture 5: ANOVA and Correlation

Lecture 5: ANOVA and Correlation Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

The enumeration of all possible outcomes of an experiment is called the sample space, denoted S. E.g.: S={head, tail}

The enumeration of all possible outcomes of an experiment is called the sample space, denoted S. E.g.: S={head, tail} Random Experiment In random experiments, the result is unpredictable, unknown prior to its conduct, and can be one of several choices. Examples: The Experiment of tossing a coin (head, tail) The Experiment

More information

Announcements. Lecture 5: Probability. Dangling threads from last week: Mean vs. median. Dangling threads from last week: Sampling bias

Announcements. Lecture 5: Probability. Dangling threads from last week: Mean vs. median. Dangling threads from last week: Sampling bias Recap Announcements Lecture 5: Statistics 101 Mine Çetinkaya-Rundel September 13, 2011 HW1 due TA hours Thursday - Sunday 4pm - 9pm at Old Chem 211A If you added the class last week please make sure to

More information

COSC 341 Human Computer Interaction. Dr. Bowen Hui University of British Columbia Okanagan

COSC 341 Human Computer Interaction. Dr. Bowen Hui University of British Columbia Okanagan COSC 341 Human Computer Interaction Dr. Bowen Hui University of British Columbia Okanagan 1 Last Class Introduced hypothesis testing Core logic behind it Determining results significance in scenario when:

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics 5. Models of Random Behavior Math 40 Introductory Statistics Professor Silvia Fernández Chapter 5 Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. Outcome: Result or answer

More information

Lecture 4. Selected material from: Ch. 6 Probability

Lecture 4. Selected material from: Ch. 6 Probability Lecture 4 Selected material from: Ch. 6 Probability Example: Music preferences F M Suppose you want to know what types of CD s males and females are more likely to buy. The CD s are classified as Classical,

More information

Econ 113. Lecture Module 2

Econ 113. Lecture Module 2 Econ 113 Lecture Module 2 Contents 1. Experiments and definitions 2. Events and probabilities 3. Assigning probabilities 4. Probability of complements 5. Conditional probability 6. Statistical independence

More information

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison Treatment Effects Christopher Taber Department of Economics University of Wisconsin-Madison September 6, 2017 Notation First a word on notation I like to use i subscripts on random variables to be clear

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Announcements. Lecture 1 - Data and Data Summaries. Data. Numerical Data. all variables. continuous discrete. Homework 1 - Out 1/15, due 1/22

Announcements. Lecture 1 - Data and Data Summaries. Data. Numerical Data. all variables. continuous discrete. Homework 1 - Out 1/15, due 1/22 Announcements Announcements Lecture 1 - Data and Data Summaries Statistics 102 Colin Rundel January 13, 2013 Homework 1 - Out 1/15, due 1/22 Lab 1 - Tomorrow RStudio accounts created this evening Try logging

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

Chapter 26: Comparing Counts (Chi Square)

Chapter 26: Comparing Counts (Chi Square) Chapter 6: Comparing Counts (Chi Square) We ve seen that you can turn a qualitative variable into a quantitative one (by counting the number of successes and failures), but that s a compromise it forces

More information

7.1: What is a Sampling Distribution?!?!

7.1: What is a Sampling Distribution?!?! 7.1: What is a Sampling Distribution?!?! Section 7.1 What Is a Sampling Distribution? After this section, you should be able to DISTINGUISH between a parameter and a statistic DEFINE sampling distribution

More information

A proportion is the fraction of individuals having a particular attribute. Can range from 0 to 1!

A proportion is the fraction of individuals having a particular attribute. Can range from 0 to 1! Proportions A proportion is the fraction of individuals having a particular attribute. It is also the probability that an individual randomly sampled from the population will have that attribute Can range

More information

Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information:

Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information: Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information: aanum@ug.edu.gh College of Education School of Continuing and Distance Education 2014/2015 2016/2017 Session Overview In this Session

More information

Data Collection: What Is Sampling?

Data Collection: What Is Sampling? Project Planner Data Collection: What Is Sampling? Title: Data Collection: What Is Sampling? Originally Published: 2017 Publishing Company: SAGE Publications, Inc. City: London, United Kingdom ISBN: 9781526408563

More information

Lesson 6 Population & Sampling

Lesson 6 Population & Sampling Lesson 6 Population & Sampling Lecturer: Dr. Emmanuel Adjei Department of Information Studies Contact Information: eadjei@ug.edu.gh College of Education School of Continuing and Distance Education 2014/2015

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 4-1 Overview 4-2 Fundamentals 4-3 Addition Rule Chapter 4 Probability 4-4 Multiplication Rule:

More information

STT 315 Problem Set #3

STT 315 Problem Set #3 1. A student is asked to calculate the probability that x = 3.5 when x is chosen from a normal distribution with the following parameters: mean=3, sd=5. To calculate the answer, he uses this command: >

More information

3.2 Probability Rules

3.2 Probability Rules 3.2 Probability Rules The idea of probability rests on the fact that chance behavior is predictable in the long run. In the last section, we used simulation to imitate chance behavior. Do we always need

More information

b. Find P(it will rain tomorrow and there will be an accident). Show your work. c. Find P(there will be an accident tomorrow). Show your work.

b. Find P(it will rain tomorrow and there will be an accident). Show your work. c. Find P(there will be an accident tomorrow). Show your work. Algebra II Probability Test Review Name Date Hour ) A study of traffic patterns in a large city shows that if the weather is rainy, there is a 50% chance of an automobile accident occurring during the

More information

Multiple Choice Practice Set 1

Multiple Choice Practice Set 1 Multiple Choice Practice Set 1 This set of questions covers material from Chapter 1. Multiple choice is the same format as for the midterm. Q1. Two events each have probability 0.2 of occurring and are

More information

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression Rebecca Barter April 20, 2015 Fisher s Exact Test Fisher s Exact Test

More information

Introduction to Statistical Data Analysis Lecture 4: Sampling

Introduction to Statistical Data Analysis Lecture 4: Sampling Introduction to Statistical Data Analysis Lecture 4: Sampling James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis 1 / 30 Introduction

More information

STAT 430/510 Probability

STAT 430/510 Probability STAT 430/510 Probability Hui Nie Lecture 3 May 28th, 2009 Review We have discussed counting techniques in Chapter 1. Introduce the concept of the probability of an event. Compute probabilities in certain

More information

Lecture Notes 12 Advanced Topics Econ 20150, Principles of Statistics Kevin R Foster, CCNY Spring 2012

Lecture Notes 12 Advanced Topics Econ 20150, Principles of Statistics Kevin R Foster, CCNY Spring 2012 Lecture Notes 2 Advanced Topics Econ 2050, Principles of Statistics Kevin R Foster, CCNY Spring 202 Endogenous Independent Variables are Invalid Need to have X causing Y not vice-versa or both! NEVER regress

More information

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 CS 70 Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 Introduction One of the key properties of coin flips is independence: if you flip a fair coin ten times and get ten

More information

THE SAMPLING DISTRIBUTION OF THE MEAN

THE SAMPLING DISTRIBUTION OF THE MEAN THE SAMPLING DISTRIBUTION OF THE MEAN COGS 14B JANUARY 26, 2017 TODAY Sampling Distributions Sampling Distribution of the Mean Central Limit Theorem INFERENTIAL STATISTICS Inferential statistics: allows

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Motivations for the ANOVA We defined the F-distribution, this is mainly used in

More information

Lecture 6. Probability events. Definition 1. The sample space, S, of a. probability experiment is the collection of all

Lecture 6. Probability events. Definition 1. The sample space, S, of a. probability experiment is the collection of all Lecture 6 1 Lecture 6 Probability events Definition 1. The sample space, S, of a probability experiment is the collection of all possible outcomes of an experiment. One such outcome is called a simple

More information

University of Jordan Fall 2009/2010 Department of Mathematics

University of Jordan Fall 2009/2010 Department of Mathematics handouts Part 1 (Chapter 1 - Chapter 5) University of Jordan Fall 009/010 Department of Mathematics Chapter 1 Introduction to Introduction; Some Basic Concepts Statistics is a science related to making

More information

STAT:5100 (22S:193) Statistical Inference I

STAT:5100 (22S:193) Statistical Inference I STAT:5100 (22S:193) Statistical Inference I Week 3 Luke Tierney University of Iowa Fall 2015 Luke Tierney (U Iowa) STAT:5100 (22S:193) Statistical Inference I Fall 2015 1 Recap Matching problem Generalized

More information

Business Statistics MBA Pokhara University

Business Statistics MBA Pokhara University Business Statistics MBA Pokhara University Chapter 3 Basic Probability Concept and Application Bijay Lal Pradhan, Ph.D. Review I. What s in last lecture? Descriptive Statistics Numerical Measures. Chapter

More information

I - Probability. What is Probability? the chance of an event occuring. 1classical probability. 2empirical probability. 3subjective probability

I - Probability. What is Probability? the chance of an event occuring. 1classical probability. 2empirical probability. 3subjective probability What is Probability? the chance of an event occuring eg 1classical probability 2empirical probability 3subjective probability Section 2 - Probability (1) Probability - Terminology random (probability)

More information

1 Using standard errors when comparing estimated values

1 Using standard errors when comparing estimated values MLPR Assignment Part : General comments Below are comments on some recurring issues I came across when marking the second part of the assignment, which I thought it would help to explain in more detail

More information

ACMS Statistics for Life Sciences. Chapter 13: Sampling Distributions

ACMS Statistics for Life Sciences. Chapter 13: Sampling Distributions ACMS 20340 Statistics for Life Sciences Chapter 13: Sampling Distributions Sampling We use information from a sample to infer something about a population. When using random samples and randomized experiments,

More information

1.3.1 Measuring Center: The Mean

1.3.1 Measuring Center: The Mean 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar) of a set of observations, add their values and divide by the number of observations. If the n observations

More information

Statistical Theory 1

Statistical Theory 1 Statistical Theory 1 Set Theory and Probability Paolo Bautista September 12, 2017 Set Theory We start by defining terms in Set Theory which will be used in the following sections. Definition 1 A set is

More information

Probability Review. AP Statistics April 25, Dr. John Holcomb, Cleveland State University

Probability Review. AP Statistics April 25, Dr. John Holcomb, Cleveland State University Probability Review AP Statistics April 25, 2015 Dr. John Holcomb, Cleveland State University PROBLEM 1 The data below comes from a nutritional study conducted at Ohio University. Eighty three subjects

More information