DAWSON COLLEGE DEPARTMENT OF MATHEMATICS 201-BZS-05 PROBABILITY AND STATISTICS FALL 2015 FINAL EXAM

Name:
Date: December 24th, 2015
Student Number:
Time: 9:30-12:30
Grade: / 116
Examiner: Matthew MARCHANT

Instructions:
1. No books or notes are permitted.
2. Only calculators without text storage and graphical capability are permitted.
3. Please show all your work clearly.
4. Please justify all your answers.
5. Cheating will result in a minimum penalty of zero in your exam grade.
6. Unless otherwise stated, round your answer to 2 decimal places.

1. [10 marks] The commuting distance was determined for each of 10 employees at Acme Manufacturing. One of the employees lives in another town and has a large commuting distance. The 10 distances were as follows: 5, 10, 7, 15, 10, 12, 8, 120, 20, 18
a. Sketch the dot plot (use employee number as the x-axis).
b. Find the mean. By how much does the outlier affect the mean?
c. What statistic do you expect is more representative of the population variability, the standard deviation or the interquartile mean? Why?

Answers:
a. [Dot plot: distance from work (y-axis, 0 to 140) against employee number (x-axis, 0 to 12).]
b. mean = 225/10 = 22.5; mean without the outlier = 105/9 = 11.67. The outlier raises the mean by about 10.83.
c. Interquartile mean, because it is a robust statistic that is less sensitive to outliers, and we have an outlier (the value 120).

2. [5 marks] A quality-control technician selects 120 assembled parts from an assembly line and records the following information concerning these parts:
A: defective or non-defective
B: the employee number of the individual who assembled the part
C: the weight of the part
a. What is the population?
b. What is the sample?
c. Give the types of the three variables (e.g. quantitative/continuous).

Answers:
a. All of the parts that have ever been produced in the factory.
b. The 120 parts selected from the assembly line.
c. A: qualitative ordinal; B: qualitative nominal; C: quantitative continuous.

3. [15 marks] A pair of fair dice is rolled once.
Let E = the event of a sum of 8
Let F = the event of a product of 15

Let G = the event of doubles
Let H = the event where a 4 and a 1 are obtained
a. Draw the Venn diagram for the 4 events.
b. Find: P(E), P(F), P(G) and P(H)
c. Find: P(E ∩ F), P(E ∩ G), P(E ∩ H), P(E | G)
d. Are E and H independent? Why?
e. Are E and G independent? Why?
f. Are E and G mutually exclusive? Why?
g. Are H and E mutually exclusive? Why?

Answers:
a. Sample space for the Venn diagram (36 equally likely outcomes):
1,1 1,2 1,3 1,4 1,5 1,6
2,1 2,2 2,3 2,4 2,5 2,6
3,1 3,2 3,3 3,4 3,5 3,6
4,1 4,2 4,3 4,4 4,5 4,6
5,1 5,2 5,3 5,4 5,5 5,6
6,1 6,2 6,3 6,4 6,5 6,6
b. P(E) = 5/36, P(F) = 2/36 = 1/18, P(G) = 6/36 = 1/6, P(H) = 2/36 = 1/18
c. P(E and F) = 2/36 = 1/18, P(E and G) = 1/36, P(E and H) = 0, P(E given G) = 1/6
d. No. E and H are mutually exclusive with non-zero probabilities, so P(E and H) = 0 is not equal to P(E)P(H); such events cannot be independent.
e. No, P(E given G) = 1/6 is not equal to P(E) = 5/36.
f. No, P(E and G) is not equal to 0.
g. Yes, P(E and H) is equal to 0.
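The question 3 answers can be checked by enumerating the 36 outcomes directly. The following is a minimal sketch using exact fractions; the helper `prob` and the variable names are illustrative choices, not part of the exam.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event given as a predicate on an outcome (a, b)."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))


def E(o):  # sum of 8
    return o[0] + o[1] == 8


def F(o):  # product of 15
    return o[0] * o[1] == 15


def G(o):  # doubles
    return o[0] == o[1]


def H(o):  # a 4 and a 1, in either order
    return sorted(o) == [1, 4]


p_E, p_F, p_G, p_H = prob(E), prob(F), prob(G), prob(H)
p_E_and_F = prob(lambda o: E(o) and F(o))
p_E_and_G = prob(lambda o: E(o) and G(o))
p_E_and_H = prob(lambda o: E(o) and H(o))
p_E_given_G = p_E_and_G / p_G  # conditional probability P(E | G)

print("P(E), P(F), P(G), P(H):", p_E, p_F, p_G, p_H)
print("P(E and F), P(E and G), P(E and H):", p_E_and_F, p_E_and_G, p_E_and_H)
print("P(E | G):", p_E_given_G)
print("E and G independent:", p_E_and_G == p_E * p_G)
print("E and H mutually exclusive:", p_E_and_H == 0)
```

Using `Fraction` keeps the results exact, so the independence check compares P(E and G) with P(E)P(G) without floating-point rounding.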

4. [10 marks] A company has 10 identical machines that produce nails independently. The probability that a machine will break down is 0.1. Define a random variable X to be the number of machines that will break down in a day.
a. What is the appropriate probability distribution for X?
b. Give the expression for the probability that r machines will break down.
c. Compute the probability that at least 1 machine will break down.
d. What is the expected number of machines that will break down?
e. What is the variance of the number of machines that will break down?

Answers:
a. Binomial (n = 10, p = 0.1)
b. P(X = r) = C(10, r) × p^r × q^(10-r), with p = 0.1 and q = 0.9
c. P(X ≥ 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - 0.9^10 = 0.65
d. E(X) = np = 1
e. Var(X) = npq = 0.9

5. [12 marks] Participants of a study with sinusitis received either an antibiotic or a placebo and were asked at the end of a 10-day period if their symptoms had improved. The responses are summarized in the table below:
[Table: self-reported improvement by group. Per the solution, 66 of 85 patients improved in the treatment group and 65 of 81 in the control group.]
a. Comment on whether or not we can make a causal statement.
b. Set up hypotheses (give Ho and Ha) to test whether the proportion of patients who reported significant improvement in symptoms is greater in the treatment group than in the control group.
c. Assume that the test statistic follows the Normal distribution and obtain i) the SE, ii) the test statistic, iii) the p-value, and iv) complete the hypothesis test at the 0.05 level of significance.

Answers:
a. Yes, it's an experiment.
b. Ho: p_treat - p_control = 0
Ha: p_treat - p_control > 0
c. p_treat_hat = 66/85 = 0.78
p_control_hat = 65/81 = 0.80
p_treat_hat - p_control_hat = -0.026
p_hat_pooled = (0.78*85 + 0.80*81)/(85+81) = 0.79
SE = 0.063
Z = -0.41
p-value (one-sided) = 0.66, which is not less than alpha = 0.05. Fail to reject Ho at the 0.05 L.O.S.
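The pooled two-proportion test in question 5 can be reproduced with the standard library only. This is a sketch under the assumptions used in the solution: counts of 66/85 and 65/81, a pooled standard error, and a one-sided (upper-tail) alternative. The function `normal_cdf` and the variable names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Counts taken from the solution: improved / group size.
x_treat, n_treat = 66, 85
x_ctrl, n_ctrl = 65, 81

p_treat = x_treat / n_treat
p_ctrl = x_ctrl / n_ctrl

# Pooled proportion under Ho: p_treat - p_control = 0.
p_pool = (x_treat + x_ctrl) / (n_treat + n_ctrl)

# Standard error of the difference, using the pooled proportion.
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_treat + 1 / n_ctrl))

# Test statistic and upper-tail p-value for Ha: p_treat - p_control > 0.
z = (p_treat - p_ctrl) / se
p_value = 1.0 - normal_cdf(z)

print(f"difference = {p_treat - p_ctrl:.3f}")
print(f"SE = {se:.3f}, Z = {z:.2f}, p-value = {p_value:.2f}")
print("reject Ho at 0.05:", p_value < 0.05)
```

Because the observed difference is negative while the alternative is "greater than", the upper-tail p-value is above 0.5, which matches the decision not to reject Ho.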

6. [12 marks] The following sample data pertain to the shipments received by a large firm from three different vendors. Test at the 0.01 level of significance whether the quality level of the items received and the vendor are independent.

           Number     Number imperfect   Number
           rejected   but acceptable     perfect   Total
Vendor A   12         23                 89        124
Vendor B   8          12                 62        82
Vendor C   21         30                 119       170
Total      41         65                 270       376

a. State the hypotheses.
b. Check the conditions.
c. Obtain the test statistic.
d. Obtain the p-value and state the conclusion.

Answers:
a. Ho: Vendor and quality are independent
Ha: Vendor and quality are not independent
b. Conditions:
1. Hard to say if we have less than 10% of the population; we will assume that we do.
2. Expected values are all greater than 5.
3. df is greater than 2 (df = (3 - 1)(3 - 1) = 4).
c. Chi-square value: 1.3
d. p-value > 0.3, which is greater than 0.01, therefore fail to reject Ho at the 0.01 L.O.S.

7. [8 marks] Suppose the probability of having the disease is 0.001. If a person has the disease, the probability of a positive test result is 0.90. If a person does not have the disease, the probability of a negative test result is 0.95. For a person selected at random from the population, what is the probability they are infected given they have tested positive?
Note: Use the following notation:
dis: a person selected at random has the disease
dis with an overbar: a person selected at random does not have the disease
pos: a person selected at random tests positive for the disease
pos with an overbar: a person selected at random tests negative for the disease

Answer: 0.018
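The question 7 answer follows from Bayes' rule together with the law of total probability. A minimal sketch of that arithmetic, with illustrative variable names:

```python
# Given quantities from the question.
p_dis = 0.001               # P(dis): prevalence of the disease
p_pos_given_dis = 0.90      # P(pos | dis)
p_neg_given_no_dis = 0.95   # P(negative | no disease)

# Complement: false-positive rate P(pos | no disease).
p_pos_given_no_dis = 1 - p_neg_given_no_dis

# Law of total probability: P(pos).
p_pos = p_pos_given_dis * p_dis + p_pos_given_no_dis * (1 - p_dis)

# Bayes' rule: P(dis | pos) = P(pos | dis) P(dis) / P(pos).
p_dis_given_pos = p_pos_given_dis * p_dis / p_pos

print(f"P(pos) = {p_pos:.5f}")
print(f"P(dis | pos) = {p_dis_given_pos:.3f}")
```

The posterior is small even though the test is fairly accurate, because false positives among the 99.9% of people without the disease far outnumber the true positives.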

8. [10 marks] Consider the following data for the time to commute to work for 10 employees:

Employee   Time to commute (min)
1          11.3
2          14.7
3          16.4
4          16.5
5          17
6          19.9
7          22.3
8          23.3
9          26.1
10         26.2

a. Obtain the class width (use 4 classes or bins).
b. Add a column for frequency.
c. Add a column for relative frequency.
d. Sketch the relative frequency histogram.

Answers:
a. Class width = (26.2 - 11.3)/4 = 3.725
b., c.
bin   upper boundary   freq   rel_freq
1     15.3             2      0.2
2     19.025           3      0.3
3     22.75            2      0.2
4     26.475           3      0.3

9. [8 marks] You are interested in learning about opinions of Dawson students about a proposed climate change policy.
a. Explain how you would use stratified sampling to obtain sample data.
b. Explain how you would use cluster sampling to obtain sample data.
c. Which of the two methods, stratified or cluster sampling, will take less time?
d. Which of the two methods, stratified or cluster sampling, is likely to give a more representative sample?

Answers:
a. Form strata. One possibility is to use the programs students are enrolled in. Perform simple random sampling within these strata.
b. Form clusters (groupings that are self similar), perform simple random sampling to select clusters, and then perform simple random sampling within these clusters.

c. Cluster sampling would take less time, as we would sample fewer students.
d. Stratified sampling, because we are using more samples and not leaving out groups, which is risky.

10. [8 marks] A student receives e-mails according to a Poisson distribution with an average of 53.5 e-mails every week.
a. Calculate the probability that the student receives exactly 115 e-mails in a 15-day period.
b. Calculate the probability that the time in between two e-mails is greater than 2 hours.

Answers:
a. 0.0372
b. 0.529

11. [6 marks] Identify the outliers in the scatterplots shown below, and determine if they are influential. Explain your reasoning.
[Scatterplots a, b and c]

Answers:
a. Certainly influential, as the slope has been strongly affected by the outlier. This is because it is a high-leverage point (big horizontal distance from the center of the data set).
b. Not at all influential, because it falls directly on the line of best fit for the cluster furthest to the left.
c. Not influential, because it has small leverage (small horizontal distance from the center of the data set).
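The question 10 answers can be checked numerically. The sketch below assumes a week of 7 days (168 hours), rescales the weekly rate to each time window, and treats the gap between e-mails as exponential, which is the waiting-time distribution of a Poisson process. Variable names are illustrative; the Poisson probability is evaluated on the log scale to avoid overflow in 115!.

```python
import math

rate_per_week = 53.5  # average e-mails per week, from the question

# a. Exactly 115 e-mails in 15 days: Poisson with mean rescaled to 15 days.
lam_15_days = rate_per_week * 15 / 7
k = 115
# log P(X = k) = k*log(lam) - lam - log(k!), with log(k!) = lgamma(k + 1).
log_pmf = k * math.log(lam_15_days) - lam_15_days - math.lgamma(k + 1)
p_exactly_115 = math.exp(log_pmf)

# b. Gap between two e-mails longer than 2 hours: P(T > t) = exp(-rate * t).
rate_per_hour = rate_per_week / (7 * 24)
p_gap_over_2h = math.exp(-rate_per_hour * 2)

print(f"mean over 15 days = {lam_15_days:.2f}")
print(f"P(X = 115) = {p_exactly_115:.4f}")
print(f"P(gap > 2 h) = {p_gap_over_2h:.3f}")
```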

12. [12 marks] Researchers studying anthropometry collected body girth measurements and skeletal diameter measurements, as well as age, weight, height and gender for 507 physically active individuals. The scatterplot below shows the relationship between height and shoulder girth (over deltoid muscles), both measured in centimeters. The mean shoulder girth is 108.20 cm with a standard deviation of 10.37 cm. The mean height is 171.14 cm with a standard deviation of 9.41 cm. The coefficient of linear correlation between height and shoulder girth is 0.67.
[Scatterplot: height (cm) against shoulder girth (cm)]
a. Write the equation of the regression line for predicting height.
b. Interpret the slope and the intercept in this context.
c. Calculate R^2 of the regression line for predicting height from shoulder girth, and interpret it in the context of the application.
d. A randomly selected student from your class has a shoulder girth of 100 cm. Predict the height of this student using the model.
e. The student from part (d) is 160 cm tall. Calculate the residual, and explain what this residual means.
f. A one-year-old has a shoulder girth of 56 cm. Would it be appropriate to use this linear model to predict the height of this child?

Answers:
a. y_hat = 105.4 + 0.61x
b. When shoulder girth is 0 cm, the predicted height is approx. 105.4 cm (which doesn't make physical sense). For an increase of 1 cm in shoulder girth, the height is expected to increase by 0.61 cm.

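c. R^2 = 0.67^2 = 0.45, which is the reduction in variability achieved by the regression line: about 45% of the variability in height is accounted for by shoulder girth.
d. y_hat(100) = 166.15
e. e = 160 - 166.15 = -6.15: the vertical distance between the observation (160 cm) and the predicted value (166.15 cm). The model overestimates this student's height.
f. y_hat(56) = 139.41, which is greater than the typical height of a one-year-old. Applying the linear model to x-values beyond the range of x-values in the data set is not recommended.

Formulas:

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

$P(A) = \frac{n(A)}{n(S)}$

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

$P(A \mid B) = \frac{n(A \cap B)}{n(B)} = \frac{P(A \cap B)}{P(B)}$

$P(A \cap B) = P(A)\,P(B \mid A)$

$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$

$\mu = E[x] = \sum_{\text{all } x} x\,P(x)$

$\sigma^2 = E[(x - \mu)^2] = \sum_{\text{all } x} (x - \mu)^2\,P(x)$

$P(X = k) = C(n, k)\,p^k q^{n-k}, \quad \mu = np, \quad \sigma^2 = npq$

$P(x) = \frac{C(k, x)\,C(N - k, n - x)}{C(N, n)}, \quad \mu = \frac{nk}{N}, \quad \sigma^2 = \frac{nk(N - k)(N - n)}{N^2 (N - 1)}$

$P(X = k) = \frac{\alpha^k}{k!} e^{-\alpha}$

$P(a < x < b) = P(a \le x \le b) = \int_a^b f(x)\,dx$

$\mu = E[X] = \int x f(x)\,dx, \quad \sigma^2 = E[(x - \mu)^2] = E[X^2] - \mu^2$

$f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $f(x) = 0$ for $x < 0$

$SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{S}{\sqrt{n}}$

C.I.: point estimate $\pm\ Z^{*} \cdot SE$

Test statistic: $Z = \frac{\text{point estimate} - \text{null value}}{SE}$

$SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

$SE_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$

$SE_{\hat{p}_1 - \hat{p}_2} = \sqrt{SE_{\hat{p}_1}^2 + SE_{\hat{p}_2}^2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$

$\hat{p}_{\text{pooled}} = \frac{\hat{p}_1 n_1 + \hat{p}_2 n_2}{n_1 + n_2}$

$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \quad df = k - 1$

$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \quad df = (R - 1)(C - 1)$

$\hat{y} = b_0 + b_1 x, \quad b_1 = \frac{s_y}{s_x} R, \quad b_0 = \bar{y} - b_1 \bar{x}$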
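As a worked check of the regression formulas on the sheet, the sketch below rebuilds the question 12 line from the summary statistics alone: slope from the correlation and the two standard deviations, intercept from the two means. The function `predict` and the variable names are illustrative, and small differences from the printed answers come from rounding the coefficients.

```python
# Summary statistics from question 12 (x = shoulder girth, y = height, in cm).
mean_x, sd_x = 108.20, 10.37
mean_y, sd_y = 171.14, 9.41
r = 0.67

# Slope and intercept of the least-squares line from summary statistics.
b1 = r * sd_y / sd_x
b0 = mean_y - b1 * mean_x


def predict(x):
    """Predicted height (cm) for a shoulder girth of x cm."""
    return b0 + b1 * x


r_squared = r ** 2

# Residual = observed - predicted, for the 160 cm student with girth 100 cm.
residual = 160 - predict(100)

print(f"y_hat = {b0:.1f} + {b1:.2f} x")
print(f"R^2 = {r_squared:.2f}")
print(f"y_hat(100) = {predict(100):.2f}, residual = {residual:.2f}")
```

A negative residual means the observed height lies below the fitted line, so the model overestimates for that student.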