Continuous data can take on any real number (e.g. time, length). Categorical data can only be named or categorised.


Question 1. (Topics 1-3)

A population consists of all the members of a group about which you want to draw a conclusion (Greek letters (μ, σ, N) are used).
A sample is the portion of the population selected for analysis (Roman letters (x, s, n) are used for sample data).
A parameter is a numerical measure that describes a characteristic of a population.
A statistic is a numerical measure that describes a characteristic of a sample.

Class intervals: width of interval $= \dfrac{\text{range}}{\text{no. of desired class groupings}}$
Arithmetic mean: $\bar{X} = \dfrac{X_1 + X_2 + \dots + X_n}{n}$
Median (position): $\dfrac{n + 1}{2}$
Range: $X_{\max} - X_{\min}$
Z score: $Z = \dfrac{X - \bar{X}}{S}$; outliers have $Z > 3.0$ or $Z < -3.0$
Measures of central tendency: arithmetic mean, median, mode
Quartile (position): $Q_1 = 0.25(n + 1)$, $Q_2 = 0.50(n + 1)$, $Q_3 = 0.75(n + 1)$
Inter-quartile range: $IQR = Q_3 - Q_1$
Measures of dispersion: variance, standard deviation, coefficient of variation
Covariance tells us only the direction of association.
Sample coefficient of correlation: $r = \dfrac{\text{covar}}{s_x s_y}$, where $s_x$ and $s_y$ come from the standard deviation formula.

Numerical Descriptive Measures

Reordered data: 3, 4, 7, 9

Variance: first find $\bar{x} = 5.75$, then

$s^2 = \dfrac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \dfrac{1}{4 - 1}\left[(3 - 5.75)^2 + (4 - 5.75)^2 + (7 - 5.75)^2 + (9 - 5.75)^2\right] = \dfrac{7.5625 + 3.0625 + 1.5625 + 10.5625}{3} = \dfrac{22.75}{3} \approx 7.58$

Standard deviation: $s = \sqrt{s^2} = \sqrt{7.58} \approx 2.754$

Coefficient of variation: $CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{2.754}{5.75} \times 100\% \approx 47.9\%$

Question 3 Continuous Probability Distribution

Find the following probabilities:

1. $P(Z < -1.67) = 0.0475$. Read straight from the table. Note: for P(Z < 1.846) we can only look up z values to two decimal places, so round 1.846 up to 1.85.
2. $P(Z > -2.78) = 1 - P(Z < -2.78) = 1 - 0.0027 = 0.9973$
3. $P(0.15 < Z < 1.99) = P(Z < 1.99) - P(Z < 0.15) = 0.9767 - 0.5596 = 0.4171$

Solve the following inverse problems for the standard normal distribution:

$P(Z > z) = 0.01$. Look up the Inverse Normal table: $P(Z > 2.3263) = 0.01$.

The Inverse table only gives the Z values for upper-tail areas. Because the normal distribution is symmetric about zero, we find the upper-tail Z value, and the lower-tail Z value that we need is the same value but negative.
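The descriptive measures for the data 3, 4, 7, 9 and the table lookups in Question 3 can be cross-checked in software. The following is a minimal sketch using only Python's standard library. The variable names are illustrative and not part of the original notes, and the results are exact values rather than table roundings, so they may differ from a printed table in the last digit.

```python
from statistics import NormalDist, mean, stdev, variance

# Descriptive measures for the reordered data in the example.
data = [3, 4, 7, 9]
x_bar = mean(data)        # arithmetic mean
s2 = variance(data)       # sample variance (divides by n - 1)
s = stdev(data)           # sample standard deviation
cv = s / x_bar * 100      # coefficient of variation, in percent

# Standard normal "table lookups" via the cumulative distribution function.
Z = NormalDist()          # mean 0, standard deviation 1
p1 = Z.cdf(-1.67)                   # P(Z < -1.67)
p2 = 1 - Z.cdf(-2.78)               # P(Z > -2.78)
p3 = Z.cdf(1.99) - Z.cdf(0.15)      # P(0.15 < Z < 1.99)

# Inverse problem: the z with an upper-tail area of 0.01.
z_upper = Z.inv_cdf(1 - 0.01)

print(f"mean={x_bar}, s^2={s2:.2f}, s={s:.3f}, CV={cv:.1f}%")
print(f"P(Z<-1.67)={p1:.4f}, P(Z>-2.78)={p2:.4f}, P(0.15<Z<1.99)={p3:.4f}")
print(f"upper-tail z for area 0.01: {z_upper:.4f}")
```

Using `inv_cdf(1 - 0.01)` mirrors the symmetry argument above: the lower-tail value for the same area is simply the negative of this result.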
Find the two values of Z (symmetrically distributed around the mean) such that the following statement is true: $P(z_{LOWER} < Z < z_{UPPER}) = 0.80$

Each tail will have an area of 0.10, so looking up the Inverse table gives the two Z values: $Z_{LOWER} = -1.2816$ and $Z_{UPPER} = 1.2816$, i.e. $P(-1.2816 < Z < 1.2816) = 0.80$.

Definitions

Descriptive Statistics: collect, present and characterise data.
Inferential Statistics: drawing conclusions about a population based on sample data.
Numerical data is measured on a natural numerical scale (age).
Continuous: data that can take on any real number (time/length).
Discrete: countable number of responses (cannot have 0.5).
Categorical data can only be named or categorised.
Nominal: no order, no response is considered better (gender).
Ordinal: there is an order (very good, good, average).
Frequency distribution: summary table in which data are arranged into numerically ordered classes or intervals.
Ordered array: sequence of data in rank order.
Time series: data collected through time (monthly sales).
Cross sectional: collected for a point in time (my height today).

Covariance and correlation example

Sample of 4: (2, 3), (7, 9), (4, 5), (4, 6)

$\bar{x} = \dfrac{2 + 7 + 4 + 4}{4} = \dfrac{17}{4} = 4.25$

$\bar{y} = \dfrac{3 + 9 + 5 + 6}{4} = \dfrac{23}{4} = 5.75$

x | y | x - x̄ | y - ȳ | (x - x̄)(y - ȳ)
2 | 3 | -2.25 | -2.75 | 6.19
7 | 9 | 2.75 | 3.25 | 8.94
4 | 5 | -0.25 | -0.75 | 0.19
4 | 6 | -0.25 | 0.25 | -0.06

$\sum (x - \bar{x})(y - \bar{y}) = 15.25$

Covariance $= \dfrac{\sum (x - \bar{x})(y - \bar{y})}{n - 1} = \dfrac{15.25}{4 - 1} = 5.08$ (direction)

Correlation $r = \dfrac{\text{covar}}{s_x s_y} = \dfrac{5.08}{2.06 \times 2.5} = 0.99$ (strength)

Continuous Probability Distribution cont.

Between what two values of Z (symmetrically distributed around the mean) will 68.26% of all possible Z values be contained?

Each tail has an area $\alpha = 0.1587$, i.e. $(1 - 0.6826)/2$. If we use the Cumulative Normal Distribution table and look for the area 0.1587, we find that $P(Z < -1) = 0.1587$. The right tail, where $Z > +1$, therefore has the same area. So the two values of Z that we are looking for are -1 and +1, i.e. $P(-1 < Z < 1) = 0.6826$, as in the diagram.

Using the Inverse Normal table, we can only look up an area to two decimal places: 0.16 (i.e. 0.1587 rounded to two decimal places). We would then conclude that the two values of Z are $Z = 0.9945$ and $Z = -0.9945$, i.e. $P(-0.9945 < Z < 0.9945) = 0.68$.

Interpreting the Correlation Coefficient

r | Interpretation
r = -1 | PERFECT negative linear
-1 < r ≤ -0.7 | STRONG negative linear
-0.7 < r ≤ -0.3 | MODERATE negative linear
-0.3 < r < 0 | WEAK negative linear
r = 0 | No relationship
0 < r < 0.3 | WEAK positive linear
0.3 ≤ r < 0.7 | MODERATE positive linear
0.7 ≤ r < 1 | STRONG positive linear
r = 1 | PERFECT positive linear

Sampling Distribution cont.

Population mean: $\mu$
Sample mean: $\bar{X}$
Population variance: $\sigma^2$
Sample proportion: $p$
Standard deviation: $S$
Variance: $S^2$

Question 4. Sampling Distribution / Estimation cont. / Confidence Intervals

Choosing the statistic:

Is it for $\mu$?
No: $\chi^2 = \dfrac{(n - 1)S^2}{\sigma^2}$
Yes: is $\sigma$ known?
No: $t = \dfrac{\bar{X} - \mu}{S / \sqrt{n}}$
Yes, quantitative data: $Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Qualitative data: $Z = \dfrac{p - \pi}{\sqrt{\pi(1 - \pi)/n}}$
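The covariance and correlation example for the sample of four (x, y) pairs can be reproduced step by step. The sketch below is one way to do it in plain Python. The helper `interpret` is a hypothetical name, introduced here only to encode the interpretation table for r, and the computation keeps full precision rather than rounding each product to two decimals.

```python
from math import sqrt

pairs = [(2, 3), (7, 9), (4, 5), (4, 6)]
n = len(pairs)

x_bar = sum(x for x, _ in pairs) / n   # 4.25
y_bar = sum(y for _, y in pairs) / n   # 5.75

# Sum of cross-products and sums of squares of the deviations.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
syy = sum((y - y_bar) ** 2 for _, y in pairs)

cov = sxy / (n - 1)            # sample covariance (direction)
s_x = sqrt(sxx / (n - 1))      # sample standard deviation of x
s_y = sqrt(syy / (n - 1))      # sample standard deviation of y
r = cov / (s_x * s_y)          # sample correlation (strength)


def interpret(r: float) -> str:
    """Label r using the bands from the interpretation table (hypothetical helper)."""
    if r == 0:
        return "No relationship"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "PERFECT"
    elif size >= 0.7:
        strength = "STRONG"
    elif size >= 0.3:
        strength = "MODERATE"
    else:
        strength = "WEAK"
    return f"{strength} {direction} linear"


print(f"cov={cov:.2f}, s_x={s_x:.2f}, s_y={s_y:.2f}, r={r:.2f}")
print(interpret(r))
```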

Question 2 Simple Linear Regression & Probability

Probability & Discrete Probability Distributions
Binomial Distribution (the question will provide n, x and the % (proportion))

Question 5 Hypothesis Testing

Two Population Proportions: example, two sample (for the rejection region use the Inverse Normal table)

Pooled-Variance t Test: example, two sample (sigma unknown, variances equal, assume n of at least 30 (Central Limit Theorem))

$df = n_1 + n_2 - 2 = 1000 + 1000 - 2 = 1998$, so $t_{0.05,\,1998} \approx 1.6449$

F Test: example, two sample (use the F table for the rejection regions)

$F_U = F_{0.025,\,99,\,71} \approx F_{0.025,\,60,\,60} = 1.67$

$F_U^{*} = F_{0.025,\,71,\,90} \approx F_{0.025,\,60,\,60} = 1.67$

$F_L = \dfrac{1}{F_U^{*}} = \dfrac{1}{1.67} = 0.599$

Analysis of Variance (ANOVA)
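The lower critical value in the F test example relies on the reciprocal property of the F distribution: a lower-tail critical value equals one over the upper-tail critical value with the degrees of freedom swapped. The symbols $\nu_1$ and $\nu_2$ (numerator and denominator degrees of freedom) are notation assumed here, not taken from the notes.

```latex
F_L = F_{1-\alpha/2,\;\nu_1,\;\nu_2}
    = \frac{1}{F_{\alpha/2,\;\nu_2,\;\nu_1}}
    = \frac{1}{F_U^{*}}
\qquad\text{so with } F_U^{*} \approx 1.67:\quad
F_L \approx \frac{1}{1.67} \approx 0.599
```

This is why an upper-tail-only F table is enough to find both ends of a two-tailed rejection region.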

BSB123 Data Analysis Semester 2 2015 Workshop 8 (Week 10) Estimation

Question 1 The quality control manager at a light bulb factory needs to estimate the mean life of a large shipment of light bulbs. The standard deviation is 100 hours. A random sample of 64 light bulbs indicates a sample mean life of 350 hours.

(a) Construct a 95% confidence interval estimate of the population mean life of light bulbs in this shipment.

(b) Do you think that the manufacturer has the right to state that the light bulbs last an average of 400 hours? Explain.

The first approach is purely to say it's outside the confidence interval. The second approach is to take that value of 400 and convert it to a Z value, so you can determine the probability that the statement is correct.
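Both approaches to part (b) can be illustrated numerically. The sketch below assumes the usual known-sigma interval $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. The variable names are illustrative, and the critical value comes from the standard library rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist

sigma = 100      # known population standard deviation (hours)
n = 64           # sample size
x_bar = 350      # sample mean life (hours)
claimed = 400    # manufacturer's claimed mean life (hours)

z_crit = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence
std_error = sigma / sqrt(n)            # 100 / 8 = 12.5
margin = z_crit * std_error

lower, upper = x_bar - margin, x_bar + margin

# Approach 1: is the claimed mean inside the interval?
claim_inside = lower <= claimed <= upper

# Approach 2: how many standard errors is the sample mean from the claim?
z_claim = (x_bar - claimed) / std_error
# Probability of a sample mean this low or lower if the claim were true.
p_low = NormalDist().cdf(z_claim)

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print(f"400 inside interval? {claim_inside}")
print(f"z for the claim: {z_claim:.1f}, P(sample mean <= 350 | mu = 400) = {p_low:.6f}")
```

The interval is roughly 325.5 to 374.5 hours, so 400 falls well outside it, and a sample mean four standard errors below the claim would be extremely unlikely if the claim were true.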

(c) Must you assume that the population of light bulb life is normally distributed? Explain.

No, because the sample size is greater than 30. According to the CLT (Central Limit Theorem), at the very least the sampling distribution of the mean will be approximately normal. In other words, with 30 observations or more, the CLT lets us treat the sample mean as normally distributed.

Question 2 If $\bar{X} = 75$, $S = 24$, $n = 36$, and assuming that the population is normally distributed, construct a 95% confidence interval estimate of the population mean $\mu$.
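For Question 2, sigma is unknown, so the interval uses the t distribution with n - 1 = 35 degrees of freedom. A minimal sketch follows. Python's standard library has no t quantile function, so the critical value 2.0301 is entered by hand as an assumed table lookup for $t_{0.025,\,35}$, and the variable names are illustrative.

```python
from math import sqrt

x_bar = 75       # sample mean
s = 24           # sample standard deviation
n = 36           # sample size
df = n - 1       # degrees of freedom = 35

# Assumed table value: upper 2.5% point of the t distribution with 35 df.
t_crit = 2.0301

std_error = s / sqrt(n)       # 24 / 6 = 4
margin = t_crit * std_error

lower, upper = x_bar - margin, x_bar + margin
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the interval is about 66.88 to 83.12. Using the normal value 1.96 instead of the t value would give a slightly narrower interval.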

Question 3 A study conducted by the Australian Stock Exchange found that 46% of 2,405 Australian adults surveyed in 2006 held shares, either directly or indirectly through managed funds or self-managed superannuation funds (2006 Australian Share Ownership Study, ASX).

(a) Construct a 95% confidence interval for the proportion of Australian adults who held shares in 2006.

When dealing with population proportions we always use a Z.

(b) Interpret the interval constructed in (a).

I am 95% confident that the true proportion of Australian adults who held shares in 2006 is between 44% and 48%.

(c) To construct a follow-up study to estimate the population proportion of adults who currently hold shares to within 0.01 with 95% confidence, how many adults would you interview?
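Parts (a) and (c) can be sketched with the standard large-sample formulas for a proportion: the interval $p \pm z\sqrt{p(1-p)/n}$ and the sample size $n = z^2\,p(1-p)/e^2$. Using the 2006 estimate of 0.46 as the planning value in (c) is an assumption made here for illustration, since the notes do not show which value was intended. The variable names are likewise illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

p_hat = 0.46     # sample proportion holding shares
n = 2405         # adults surveyed
z_crit = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence

# (a) 95% confidence interval for the population proportion.
std_error = sqrt(p_hat * (1 - p_hat) / n)
margin = z_crit * std_error
lower, upper = p_hat - margin, p_hat + margin

# (c) Sample size for a margin of error of 0.01, always rounded up.
e = 0.01
n_planned = ceil(z_crit ** 2 * p_hat * (1 - p_hat) / e ** 2)

# Conservative alternative if no prior estimate is trusted: use p = 0.5.
n_conservative = ceil(z_crit ** 2 * 0.5 * 0.5 / e ** 2)

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print(f"sample size using p = 0.46: {n_planned}")
print(f"sample size using p = 0.50: {n_conservative}")
```

The interval of roughly 0.44 to 0.48 matches the interpretation in (b). The planning value matters little here because 0.46 is close to 0.5, where $p(1-p)$ is at its maximum.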