STATISTICS ASSIGNMENT 2

Matteo Sostero
June 10, 2010

Introduction

The following document is a brief statistical report written for the second assignment. It addresses the questions raised on a dataset of 120 students of the degree course in Economics & Management. The full R code, including the commands producing tables and graphs, is given in the Sostero_Assignment_2.R script.¹

Remark. The first two Questions assume that the observed units are the reference population, whereas the other two assume that the observed units are a sample from some population whose properties are inferred from the sample.

1 Question

1.1 What is the distribution of students according to province?

The distribution of students according to province can be derived with a frequency table.

Table 1: Frequency table of provinces in dataset (columns: Other, Padova, Treviso, Venezia, Unknown; rows: absolute frequency, relative frequency)

A noteworthy feature of the dataset is that it includes three students for whom the province is unknown.

1.2 Suppose we draw at random (with or without replacement) a sample of $n$ students. Describe the probability distribution of the (random) number of students whose residence province is Venezia (VE). What is the corresponding expectation?

Let us consider both sampling schemes. In the case of sampling with replacement, the probability of observing $k$ Venetian students in a sample of size $n$ is described by the binomial probability function. Since there are 42 Venetians in the population of 120, the probability of picking one of them when sampling with replacement is $p = 42/120 = 0.35$, so that

\[ p(X = k) = \binom{n}{k} \, 0.35^{k} \, 0.65^{\,n-k} \]

¹ This document was created with R on an i386-pc-mingw32 platform.
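The binomial probability function above can be tabulated directly in R. The following is a minimal sketch, assuming a sample size of n = 30: this value is not stated in the text and is an assumption here, chosen because it gives an expectation of 10.5 (= 0.35 × 30). The names `n`, `p`, `k`, `pmf`, `pmf_manual` and `ex` are illustrative and do not come from the original script.

```r
# Assumed sample size (hypothetical: not stated in the text)
n <- 30
# Probability of drawing a Venetian student with replacement: 42 of 120
p <- 42 / 120

# Binomial probability function p(X = k) for k = 0, ..., n
k <- 0:n
pmf <- dbinom(k, size = n, prob = p)

# Same values from the explicit formula choose(n, k) * p^k * (1 - p)^(n - k)
pmf_manual <- choose(n, k) * p^k * (1 - p)^(n - k)

# Expectation computed from the probability function; it equals n * p
ex <- sum(k * pmf)
```

Computing the expectation as a probability-weighted sum, rather than as `n * p`, is only meant to show that the two definitions agree.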

When sampling without replacement, however, the probability of picking $k$ Venetian students in a sample of $n$ from a population of 120, of which 42 are Venetians and 78 are not, is described by the hypergeometric distribution:

\[ p(X = k) = \frac{\binom{42}{k} \binom{78}{n-k}}{\binom{120}{n}} \]

A graphical representation of these probability functions is given by the following graph.

[Figure: two panels, "Sampling with replacement" and "Sampling without replacement"; x-axis: k Venetians in sample; y-axis: Probability Function]

We will restrict our analysis to samples obtained with replacement. The corresponding expectation is:

\[ E(X) = \mu = np = 0.35 \, n \]

1.3 Consider a (real) sample of students. What is the observed number of students whose residence province is Venezia? What is the observed absolute error with respect to expectation? What is the probability of observing a higher error?

We generate a sample of students called sam_ with a pseudorandom number generator:

> set.seed( )
> rseq <- sort(sample(1:dim(ems)[1], , replace = T))
> sam_ <- ems[rseq, ]
> length(sam_$prov[sam_$prov == "VE"])
[1] 11

The number of Venetian students in the sample is 11, with an (absolute) error of 0.5 from the expectation. The random sample seems to depict the population share of Venetians quite accurately. In order to gauge whether this is due merely to sampling luck, we compute the probability of obtaining a higher absolute error. Observing a (strictly) higher error, in this case, means observing a number of Venetians in $[0, 9] \cup [12, n]$.

> sum(dbinom(c(0:9, 12:), , 0.35))
[1]

There is a probability of around 70.2% of observing a strictly larger error than the one of the above-mentioned sample. Notice that, since the binomial distribution is discrete, the probability of observing a larger sampling error with respect to the expectation is the same for any error ε < 1. It seems, therefore, that this sample is mercifully close to the expectation.
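The two sampling schemes and the error probability discussed above can be sketched in a few lines. This is only an illustration under the assumption n = 30, a value that is not given in the text but is consistent with an expectation of 10.5; all variable names are illustrative. Under that assumption the last quantity should land near the figure of roughly 70% reported above.

```r
n <- 30    # assumed sample size (hypothetical: not stated in the text)
N <- 120   # population size
K <- 42    # Venetian students in the population
k <- 0:n
mu <- n * K / N   # expectation, identical under both sampling schemes

# Sampling with replacement: binomial probabilities
p_with <- dbinom(k, size = n, prob = K / N)
# Sampling without replacement: hypergeometric probabilities
# dhyper(x, m, n, k): m successes and n failures in the population, k draws
p_without <- dhyper(k, m = K, n = N - K, k = n)

# Probability of an absolute error strictly larger than the observed one
observed <- 11
err <- abs(observed - mu)
# The small tolerance guards the strict inequality against rounding noise
p_larger <- sum(p_with[abs(k - mu) > err + 1e-9])
```

Selecting the outcomes by their distance from the expectation, instead of hard-coding the ranges 0:9 and 12:n, keeps the computation valid if the observed count or the sample size changes.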

2 Question

2.1 Suppose again we draw at random (with replacement) a sample of $n$ students. What is the joint distribution of the number of students of the four categories of residence province? What is the expectation vector?

Let $X_{ve}, X_{pd}, X_{tv}, X_{ot}, X_{uk}$ be the (random) numbers of sampled students from the provinces of Venice, Padua, Treviso, other provinces and unknown locations, respectively. Each category has probability $p_{ve}, p_{pd}, p_{tv}, p_{ot}, p_{uk}$ of being drawn. Let also $x_{ve}, x_{pd}, x_{tv}, x_{ot}, x_{uk}$ be the actual numbers of observed occurrences in the sample. The probability that $X_{ve} = x_{ve}, X_{pd} = x_{pd}, X_{tv} = x_{tv}, X_{ot} = x_{ot}, X_{uk} = x_{uk}$ is described by the multinomial distribution:

\[ p(X_{ve} = x_{ve}, X_{pd} = x_{pd}, X_{tv} = x_{tv}, X_{ot} = x_{ot}, X_{uk} = x_{uk}) = \binom{n}{x_{ve}\; x_{pd}\; x_{tv}\; x_{ot}\; x_{uk}} \, p_{ve}^{x_{ve}} \, p_{pd}^{x_{pd}} \, p_{tv}^{x_{tv}} \, p_{ot}^{x_{ot}} \, p_{uk}^{x_{uk}} \]

The expectation vector is

\[ \begin{pmatrix} E(X_{ve}) \\ E(X_{tv}) \\ E(X_{pd}) \\ E(X_{ot}) \\ E(X_{uk}) \end{pmatrix} = \begin{pmatrix} n \, p_{ve} \\ n \, p_{tv} \\ n \, p_{pd} \\ n \, p_{ot} \\ n \, p_{uk} \end{pmatrix} \]

2.2 Consider a (real) sample of students and the corresponding classification of students according to the four categories. What is the probability of the observed result?

We create another sample with replacement and call it sam_2.

> set.seed(190793)
> rseq_2 <- sort(sample(1:dim(ems)[1], , replace = T))
> sam_2 <- ems[rseq_2, ]

The corresponding frequency table is:

Table 2: Frequency table of provinces in sample (columns: Other, Padova, Treviso, Venezia, Unknown; rows: absolute frequency, relative frequency)

The probability of observing this result, following the multinomial distribution, is derived from the joint frequency function: p(11, 6, 5, 11)

The exact probability can be computed with R by using the absolute frequencies in the sample (abs_f_s) as the vector of observed occurrences and the corresponding relative frequencies in the population (rel_f) as the vector of probabilities.²

> dmultinom(x = abs_f_s, size = , prob = rel_f)
[1]

The probability of observing the above-mentioned result is about 0.23%.
² Both vectors were created to build their corresponding frequency tables; cf. the script Assignment_2.R for details.
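The multinomial computation of Question 2 can be illustrated as follows. This is a sketch on partly made-up numbers: only the population counts for Venezia (42) and Unknown (3) come from the text, while the remaining population counts, the sample size n = 30 and the observed sample counts are hypothetical placeholders, as are all variable names. It shows that `dmultinom` agrees with the explicit formula and that the expectation vector is n times the probability vector.

```r
# Population counts by province. Only Venezia (42) and Unknown (3) come from
# the text; the other three counts are hypothetical placeholders.
pop <- c(Other = 20, Padova = 25, Treviso = 30, Venezia = 42, Unknown = 3)
rel_f_demo <- pop / sum(pop)

n <- 30  # assumed sample size (hypothetical)
# Expectation vector n * p_i
expectation <- n * rel_f_demo

# Hypothetical observed counts summing to n
x <- c(Other = 5, Padova = 6, Treviso = 8, Venezia = 10, Unknown = 1)

# Multinomial probability via R's built-in function
p_builtin <- dmultinom(x, size = n, prob = rel_f_demo)
# Same probability from the formula n! / prod(x_i!) * prod(p_i^x_i)
p_manual <- factorial(n) / prod(factorial(x)) * prod(rel_f_demo^x)
```

With the real `abs_f_s` and `rel_f` vectors from the script in place of the placeholders, the same two lines would reproduce the probability reported in the text.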

3 Question

3.1 What is the distribution of E&M students according to gender?

Once again, we use a frequency table:

Table 3: Frequency table of genders in dataset (columns: F, M; rows: absolute frequency, relative frequency)

3.2 Discuss the statistical hypothesis that in the reference population the frequency of male students is equal to 1/2.

If we assume that the E&M dataset is a sample of a larger underlying population, we can describe the number of male students with a binomial distribution, where the modality "male" is counted as a success, with n = 120. We use a binomial test to check the null hypothesis $H_0 : p_s = 0.5$ against the alternative hypothesis $H_1 : p_s \neq 0.5$.

> binom.test(c(56, 64), p = 0.5, alternative = "two.sided", conf.level = 0.95)

Exact binomial test

data: c(56, 64)
number of successes = 56, number of trials = 120, p-value =
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:

sample estimates:
probability of success

The p-value of this test is relatively high, so the data give no grounds to reject the null hypothesis. Furthermore, we notice that 0.5 is well within the confidence interval (0.37, 0.55), which is consistent with this result.

4 Question (Consider only second year students)

4.1 Compare the performance of male and female students according to the number of (recorded) credits.

Let's partition the sample according to gender.

> ems2y <- ems[ems$year == "2", c(1, 5)]
> Fcr <- ems2y$ncr[ems2y$gen == "F"]
> Mcr <- ems2y$ncr[ems2y$gen == "M"]
> length(Fcr)
[1]
> length(Mcr)
[1] 33
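Returning to the gender test of Question 3.2, the two-sided p-value of the exact binomial test can be rebuilt by hand, which makes explicit what "relatively high" refers to. The sketch below uses only the counts shown in the call above (56 successes out of 120 trials); the variable names are illustrative, and the small multiplicative tolerance is an assumption made here to keep ties robust to rounding.

```r
# Counts used in the test above: 56 successes, 64 failures
res <- binom.test(c(56, 64), p = 0.5, alternative = "two.sided",
                  conf.level = 0.95)

# Two-sided exact p-value by hand: add up the probabilities of all outcomes
# that are no more likely than the observed one under H0
k <- 0:120
d <- dbinom(k, size = 120, prob = 0.5)
p_manual <- sum(d[d <= dbinom(56, 120, 0.5) * (1 + 1e-7)])

# Exact confidence interval for the success probability from binom.test
ci <- res$conf.int
```

Because the null distribution is symmetric when p = 0.5, the same value is obtained as twice the lower tail probability, `2 * pbinom(56, 120, 0.5)`.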

A noteworthy feature of this partition is that it excludes students for whom the number of credits has not been recorded. Only 63 of the 120 units provide data concerning credits. Although these are quite evenly split between genders, the small size of the sample can undermine the accuracy of our inferences. We compute relevant summary statistics for the two sub-samples and produce a boxplot describing the distribution of credits according to gender.

Table 4: Summary statistics for 2nd year students by gender (columns: Min., 1st Qu., Median, Mean, 3rd Qu., Max., Sd.; rows: Males, Females)

> boxplot(ems2y$ncr ~ ems2y$gen, notch = T, horizontal = T)

[Figure: horizontal notched boxplots of the number of credits for groups F and M]

One thing that stands out from the comparison is that the medians and their 95% confidence intervals coincide, while the means and standard deviations don't. This seems to indicate that the two distributions are shaped differently. Looking at the actual distribution of units with a stem-and-leaf plot unveils some other interesting features:

> stem(Fcr)

The decimal point is 1 digit(s) to the right of the |

> stem(Mcr)

The decimal point is 1 digit(s) to the right of the |
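A table like Table 4 and the notch limits of the boxplot can be computed as sketched below. The data are simulated and entirely hypothetical (the `ems` data set is not reproduced here), including the group sizes, means and standard deviations; `demo`, `stats_by_gen` and `notch` are illustrative names. The notch formula is the one documented for `boxplot.stats`, median ± 1.58 · IQR / √n, with the IQR measured between the hinges.

```r
set.seed(1)  # arbitrary seed, for reproducibility of this illustration only
# Simulated credits (hypothetical data; the real ems data set is not shown)
demo <- data.frame(
  gen = rep(c("F", "M"), times = c(30, 33)),
  ncr = round(c(rnorm(30, mean = 60, sd = 20), rnorm(33, mean = 55, sd = 15)))
)

# One row of summary statistics per gender, with the standard deviation added
stats_by_gen <- t(sapply(split(demo$ncr, demo$gen),
                         function(x) c(unclass(summary(x)), Sd. = sd(x))))

# Notch limits drawn by boxplot(..., notch = TRUE):
# median +/- 1.58 * IQR / sqrt(n), with the IQR taken between the hinges
notch <- function(x) {
  f <- fivenum(x)
  f[3] + c(-1.58, 1.58) * (f[4] - f[2]) / sqrt(length(x))
}
notch_F <- notch(demo$ncr[demo$gen == "F"])
notch_M <- notch(demo$ncr[demo$gen == "M"])
```

Overlapping notches are usually read as weak evidence of a difference between medians, which is the reading applied to the boxplot above.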

As far as the distribution of credits of female students is concerned, the data seem to be quite scattered. The pattern shows neither clear unimodality nor symmetry.

4.2 How do you evaluate (statistically) the hypothesis that females obtain better results?

In order to check the hypothesis with a t-test, we need to make sure that both samples can be approximated by a normal distribution.

> ks.test(Mcr, "pnorm", mean(Mcr), sd(Mcr))

One-sample Kolmogorov-Smirnov test

data: Mcr
D = , p-value =
alternative hypothesis: two-sided

> ks.test(Fcr, "pnorm", mean(Fcr), sd(Fcr))

One-sample Kolmogorov-Smirnov test

data: Fcr
D = , p-value =
alternative hypothesis: two-sided

The Kolmogorov-Smirnov test does not reject the hypothesis of normality for either sample, which allows us to use the t-test to check our hypothesis. Since we are only interested in assessing whether female students score more credits than males, we set up a one-sided t-test. The null hypothesis is $H_0 : \mu_{Fcr} - \mu_{Mcr} = 0$; the alternative hypothesis is $H_1 : \mu_{Fcr} - \mu_{Mcr} > 0$.

> t.test(Fcr, Mcr, alternative = "greater", conf.level = 0.95)

Welch Two Sample t-test

data: Fcr and Mcr
t = , df = , p-value =
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 Inf
sample estimates:
mean of x mean of y

The p-value (6.5%) is relatively low: it allows us to reject the null hypothesis at the 10% significance level, though not at the 5% level. The data, therefore, seem to indicate that female students score more credits than male ones, with the caveat of the restricted sample size.
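The one-sided Welch test used above can be unpacked into its ingredients. The sketch below runs on simulated, hypothetical data (group sizes, means and standard deviations are made up, and `Fcr_demo` and `Mcr_demo` are illustrative stand-ins for `Fcr` and `Mcr`); it recomputes the test statistic, the Welch-Satterthwaite degrees of freedom and the upper-tail p-value, and these agree with what `t.test` returns.

```r
set.seed(2)  # arbitrary seed, for this illustration only
# Simulated credits (hypothetical; group sizes and parameters are made up)
Fcr_demo <- rnorm(30, mean = 60, sd = 20)
Mcr_demo <- rnorm(33, mean = 55, sd = 15)

res <- t.test(Fcr_demo, Mcr_demo, alternative = "greater", conf.level = 0.95)

# Welch statistic and Welch-Satterthwaite degrees of freedom by hand
vF <- var(Fcr_demo) / length(Fcr_demo)
vM <- var(Mcr_demo) / length(Mcr_demo)
t_manual <- (mean(Fcr_demo) - mean(Mcr_demo)) / sqrt(vF + vM)
df_manual <- (vF + vM)^2 /
  (vF^2 / (length(Fcr_demo) - 1) + vM^2 / (length(Mcr_demo) - 1))

# One-sided p-value: upper tail of the t distribution
p_manual <- pt(t_manual, df = df_manual, lower.tail = FALSE)
```

Since the alternative is "greater", only the upper tail counts, which is why the reported confidence interval is unbounded above.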


More information

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling Review for Final For a detailed review of Chapters 1 7, please see the review sheets for exam 1 and. The following only briefly covers these sections. The final exam could contain problems that are included

More information

Binomial random variable

Binomial random variable Binomial random variable Toss a coin with prob p of Heads n times X: # Heads in n tosses X is a Binomial random variable with parameter n,p. X is Bin(n, p) An X that counts the number of successes in many

More information

Introduction to hypothesis testing

Introduction to hypothesis testing Introduction to hypothesis testing Review: Logic of Hypothesis Tests Usually, we test (attempt to falsify) a null hypothesis (H 0 ): includes all possibilities except prediction in hypothesis (H A ) If

More information

z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests

z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests Chapters 3.5.1 3.5.2, 3.3.2 Prof. Tesler Math 283 Fall 2018 Prof. Tesler z and t tests for mean Math

More information

Unit 4 Probability. Dr Mahmoud Alhussami

Unit 4 Probability. Dr Mahmoud Alhussami Unit 4 Probability Dr Mahmoud Alhussami Probability Probability theory developed from the study of games of chance like dice and cards. A process like flipping a coin, rolling a die or drawing a card from

More information

a) The runner completes his next 1500 meter race in under 4 minutes: <

a) The runner completes his next 1500 meter race in under 4 minutes: < I. Let X be the time it takes a runner to complete a 1500 meter race. It is known that for this specific runner, the random variable X has a normal distribution with mean μ = 250.0 seconds and standard

More information

Correlation & Simple Regression

Correlation & Simple Regression Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.

More information

SPSS LAB FILE 1

SPSS LAB FILE  1 SPSS LAB FILE www.mcdtu.wordpress.com 1 www.mcdtu.wordpress.com 2 www.mcdtu.wordpress.com 3 OBJECTIVE 1: Transporation of Data Set to SPSS Editor INPUTS: Files: group1.xlsx, group1.txt PROCEDURE FOLLOWED:

More information

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model 1 Linear Regression 2 Linear Regression In this lecture we will study a particular type of regression model: the linear regression model We will first consider the case of the model with one predictor

More information

Psych Jan. 5, 2005

Psych Jan. 5, 2005 Psych 124 1 Wee 1: Introductory Notes on Variables and Probability Distributions (1/5/05) (Reading: Aron & Aron, Chaps. 1, 14, and this Handout.) All handouts are available outside Mija s office. Lecture

More information

Inferences About Two Proportions

Inferences About Two Proportions Inferences About Two Proportions Quantitative Methods II Plan for Today Sampling two populations Confidence intervals for differences of two proportions Testing the difference of proportions Examples 1

More information

Data 1 Assessment Calculator allowed for all questions

Data 1 Assessment Calculator allowed for all questions Foundation Higher Data Assessment Calculator allowed for all questions MATHSWATCH All questions Time for the test: 4 minutes Name: MATHSWATCH ANSWERS Grade Title of clip Marks Score Percentage Clip 84

More information

Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t

Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t t Confidence Interval for Population Mean Comparing z and t Confidence Intervals When neither z nor t Applies

More information

STP 226 EXAMPLE EXAM #3 INSTRUCTOR:

STP 226 EXAMPLE EXAM #3 INSTRUCTOR: STP 226 EXAMPLE EXAM #3 INSTRUCTOR: Honor Statement: I have neither given nor received information regarding this exam, and I will not do so until all exams have been graded and returned. Signed Date PRINTED

More information

BINF702 SPRING 2015 Chapter 7 Hypothesis Testing: One-Sample Inference

BINF702 SPRING 2015 Chapter 7 Hypothesis Testing: One-Sample Inference BINF702 SPRING 2015 Chapter 7 Hypothesis Testing: One-Sample Inference BINF702 SPRING 2014 Chapter 7 Hypothesis Testing 1 Section 7.9 One-Sample c 2 Test for the Variance of a Normal Distribution Eq. 7.40

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance Chapter 8 Student Lecture Notes 8-1 Department of Economics Business Statistics Chapter 1 Chi-square test of independence & Analysis of Variance ECON 509 Dr. Mohammad Zainal Chapter Goals After completing

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

CS 361: Probability & Statistics

CS 361: Probability & Statistics February 26, 2018 CS 361: Probability & Statistics Random variables The discrete uniform distribution If every value of a discrete random variable has the same probability, then its distribution is called

More information

Units. Exploratory Data Analysis. Variables. Student Data

Units. Exploratory Data Analysis. Variables. Student Data Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as

More information

The Normal Distribution

The Normal Distribution The Mary Lindstrom (Adapted from notes provided by Professor Bret Larget) February 10, 2004 Statistics 371 Last modified: February 11, 2004 The The (AKA Gaussian Distribution) is our first distribution

More information

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution.

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. 1 Histograms p53 Spoiled ballots are a real threat to democracy. Below are

More information

Sign test. Josemari Sarasola - Gizapedia. Statistics for Business. Josemari Sarasola - Gizapedia Sign test 1 / 13

Sign test. Josemari Sarasola - Gizapedia. Statistics for Business. Josemari Sarasola - Gizapedia Sign test 1 / 13 Josemari Sarasola - Gizapedia Statistics for Business Josemari Sarasola - Gizapedia 1 / 13 Definition is a non-parametric test, a special case for the binomial test with p = 1/2, with these applications:

More information

Quantitative Methods Chapter 0: Review of Basic Concepts 0.1 Business Applications (II) 0.2 Business Applications (III)

Quantitative Methods Chapter 0: Review of Basic Concepts 0.1 Business Applications (II) 0.2 Business Applications (III) Quantitative Methods Chapter 0: Review of Basic Concepts 0.1 Business Applications (II) 0.1.1 Simple Interest 0.2 Business Applications (III) 0.2.1 Expenses Involved in Buying a Car 0.2.2 Expenses Involved

More information

Introduction to Probability and Statistics Slides 3 Chapter 3

Introduction to Probability and Statistics Slides 3 Chapter 3 Introduction to Probability and Statistics Slides 3 Chapter 3 Ammar M. Sarhan, asarhan@mathstat.dal.ca Department of Mathematics and Statistics, Dalhousie University Fall Semester 2008 Dr. Ammar M. Sarhan

More information

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Histograms: Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Sep 9 1:13 PM Shape: Skewed left Bell shaped Symmetric Bi modal Symmetric Skewed

More information

Probability theory and inference statistics! Dr. Paola Grosso! SNE research group!! (preferred!)!!

Probability theory and inference statistics! Dr. Paola Grosso! SNE research group!!  (preferred!)!! Probability theory and inference statistics Dr. Paola Grosso SNE research group p.grosso@uva.nl paola.grosso@os3.nl (preferred) Roadmap Lecture 1: Monday Sep. 22nd Collecting data Presenting data Descriptive

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Section 1.2 with Graphs The Practice of Statistics, 4 th edition - For AP* STARNES, YATES, MOORE Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1

More information

Statistics and Quantitative Analysis U4320

Statistics and Quantitative Analysis U4320 Statistics and Quantitative Analysis U3 Lecture 13: Explaining Variation Prof. Sharyn O Halloran Explaining Variation: Adjusted R (cont) Definition of Adjusted R So we'd like a measure like R, but one

More information

Unit 6 - Introduction to linear regression

Unit 6 - Introduction to linear regression Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,

More information

University of Jordan Fall 2009/2010 Department of Mathematics

University of Jordan Fall 2009/2010 Department of Mathematics handouts Part 1 (Chapter 1 - Chapter 5) University of Jordan Fall 009/010 Department of Mathematics Chapter 1 Introduction to Introduction; Some Basic Concepts Statistics is a science related to making

More information

Study Sheet. December 10, The course PDF has been updated (6/11). Read the new one.

Study Sheet. December 10, The course PDF has been updated (6/11). Read the new one. Study Sheet December 10, 2017 The course PDF has been updated (6/11). Read the new one. 1 Definitions to know The mode:= the class or center of the class with the highest frequency. The median : Q 2 is

More information