Statistics Lecture 27: Final Review


Statistics - Lecture 27
Final review

Administrative Notes
Final Exam is Tuesday, May 10th (3-5pm)
Covers Chapters 1-8 and 10 in textbook
Bring ID cards to final!
Allowed: Calculators, double-sided 8.5 x 11 cheat sheet
Exam Rooms:
Stat Lecture | Last Name | Final Exam Room
am, pm | Everyone | MEYERSON HALL B
3pm | Everyone | COHEN HALL G7

April 26, 2016 - Stat - Lecture 6 - Review

Administrative Notes
Office hours will be held throughout the exam period up until the final exam on May 10th
A list of additional textbook study problems from the second half of the course will also be posted on the course website

Outline
Collecting Data (Chapter 3)
Exploring Data - One variable (Chapter 1)
Exploring Data - Two variables (Chapter 2)
Probability (Chapter 4)
Sampling Distributions (Chapter 5)
Introduction to Inference (Chapter 6)
Inference for Means (Chapter 7)
Inference for Proportions (Chapter 8)
Inference for Regression (Chapter 10)
Urban Analytics Case Study

Experiments
[Diagram: Experimental Units split into Treatment Group (Treatment) and Control Group (No Treatment)]
Try to establish the causal effect of a treatment
Key is reducing the presence of confounding variables
Matching: ensure treatment/control groups are very similar on observed variables, e.g. race, gender, age
Randomization: randomly dividing into treatment or control leads to groups that are similar on observed and unobserved confounding variables
Double-Blinding: both subjects and evaluators don't know who is in treatment group vs. control group

Sampling and Surveys
[Diagram: Parameter (?), Sampling, Sample, Estimation, Inference]
Just like in experiments, we must be cautious of potential sources of bias in our sampling results
Voluntary response samples, undercoverage, nonresponse, untrue-response, wording of questions
Simple Random Sampling: less biased since each individual in the population has an equal chance of being included in the sample
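The randomization and simple random sampling ideas from the Experiments and Sampling slides can be sketched in a few lines of Python. This is only an illustration: the function names, the subject IDs, and the fixed seed are assumptions made for the example, not part of the lecture.

```python
import random


def randomize(units, seed=0):
    """Randomly split experimental units into treatment and control groups."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def simple_random_sample(population, k, seed=0):
    """Draw k individuals without replacement; each has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(list(population), k)


# Hypothetical subject IDs, not data from the course
subjects = list(range(1, 21))
treatment, control = randomize(subjects)
sample = simple_random_sample(subjects, 5)

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
print("SRS of 5: ", sorted(sample))
```

Because group membership depends only on the shuffle, neither observed nor unobserved characteristics of a subject can influence which group it lands in.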

Different Types of Graphs
A distribution describes what values a variable takes and how frequently these values occur
Boxplots are good for center, spread, and outliers but don't indicate the shape of a distribution
Histograms are much more effective at displaying the shape of a distribution

Measures of Center and Spread
Center: Mean $\bar{X} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
Spread: Standard Deviation $s = \sqrt{\frac{\sum (x_i - \bar{X})^2}{n - 1}}$
For outliers or asymmetry, median/IQR are better
Center: Median - middle number in distribution
Spread: Inter-Quartile Range IQR = Q3 - Q1
We use mean and SD more since most distributions are symmetric with no outliers (e.g. Normal)

Relationships between continuous var.
Scatterplot examines relationship between response variable (Y) and an explanatory variable (X)
Education and Mortality: r = -0.5
Positive vs. negative associations
Correlation is a measure of the strength of linear relationship between variables X and Y
r near 1 or -1 means strong linear relationship
r near 0 means weak linear relationship
Linear Regression: come back to later

Probability
Random process: outcome not known exactly, but have probability distribution of possible outcomes
Event: outcome of random process with prob. P(A)
Probability calculations: combinations of rules
Equally likely outcomes rule
Complement rule
Additive rule for disjoint events
Multiplication rule for independent events
Random variable: a numerical outcome or summary of a random process
Discrete r.v. has a finite number of distinct values
Continuous r.v. has a non-countable number of values
Linear transformations of variables

The Normal Distribution
The Normal distribution has center µ and spread σ
[Figure: example Normal curves with different centers and spreads]
Have tables for any probability from the standard normal distribution (µ = 0 and σ = 1)
Standardization: converting X, which has a Normal distribution with center µ and spread σ, to Z, which has a N(0, 1) distribution: $Z = \frac{X - \mu}{\sigma}$
Reverse standardization: converting a standard normal Z into a non-standard normal X: $X = \sigma Z + \mu$

Inference using Samples
[Diagram: Parameters: µ or p (?), Sampling, Sample, Statistics: $\bar{X}$ or $\hat{p}$, Estimation, Inference]
Continuous: pop. mean estimated by sample mean
Discrete: pop. proportion estimated by sample proportion
Key for inference: Sampling Distributions
Distribution of values taken by statistic in all possible samples from the same population
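The standardization and reverse standardization formulas from the Normal Distribution slide can be checked with Python's standard library, where `statistics.NormalDist` plays the role of the standard normal table. The values µ = 100, σ = 15, and x = 130 are made up for illustration, and the helper names are hypothetical.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma"""
    return (x - mu) / sigma


def unstandardize(z, mu, sigma):
    """X = sigma * Z + mu"""
    return sigma * z + mu


# Hypothetical Normal population, not from the lecture
mu, sigma = 100.0, 15.0
x = 130.0

z = standardize(x, mu, sigma)          # how many SDs x lies above the center
p_below = NormalDist().cdf(z)          # P(Z <= z), the table lookup
x_back = unstandardize(z, mu, sigma)   # recovers the original x

# Reverse direction: the value with 90% of the distribution below it
z_90 = NormalDist().inv_cdf(0.90)
x_90 = unstandardize(z_90, mu, sigma)

print(f"z = {z:.2f}, P(X <= {x}) = {p_below:.4f}")
print(f"90th percentile of X = {x_90:.2f}")
```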

Sampling Distribution of Sample Mean
The center of the sampling distribution of the sample mean is the population mean: mean($\bar{X}$) = µ
Over all samples, the sample mean will, on average, be equal to the population mean (no guarantees for 1 sample!)
The standard deviation of the sampling distribution of the sample mean is SD($\bar{X}$) = $\frac{\sigma}{\sqrt{n}}$
As sample size increases, standard deviation of the sample mean decreases!
Central Limit Theorem: if the sample size is large enough, then the sample mean $\bar{X}$ has an approximately Normal distribution

Binomial/Normal Dist. For Proportions
Sample count Y follows Binomial distribution which we can calculate from Binomial tables in small samples
If the sample size is large (np and n(1-p) ≥ 10), sample count Y follows a Normal distribution: mean(Y) = np, SD(Y) = $\sqrt{np(1-p)}$
If the sample size is large, the sample proportion also approximately follows a Normal distribution: mean($\hat{p}$) = p, SD($\hat{p}$) = $\sqrt{\frac{p(1-p)}{n}}$

Summary of Sampling Distribution
Type of Data | Unknown Parameter | Statistic | Variability of Statistic | Distribution of Statistic
Continuous | µ | $\bar{X}$ | SD($\bar{X}$) = $\frac{\sigma}{\sqrt{n}}$ | Normal (if n large)
Count $X_i$ = 0 or 1 | p | $\hat{p}$ | SD($\hat{p}$) = $\sqrt{\frac{p(1-p)}{n}}$ | Binomial (if n small), Normal (if n large)

Introduction to Inference
Use sample estimate as center of a confidence interval of likely values for population parameter
All confidence intervals have the same form: Estimate ± Margin of Error
The margin of error is always some multiple of the standard deviation (or standard error) of statistic
Hypothesis test: data supports specific hypothesis?
1. Formulate your Null and Alternative Hypotheses
2. Calculate the test statistic: difference between data and your null hypothesis
3. Find the p-value for the test statistic: how probable is your data if the null hypothesis is true?
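The sampling distribution formulas above, SD($\bar{X}$) = σ/√n and SD($\hat{p}$) = √(p(1-p)/n), can be illustrated with a small simulation. Everything specific here is an assumption chosen for the sketch: a Uniform(0, 1) population (mean 0.5, SD √(1/12)), p = 0.3, the sample sizes, the number of repetitions, and the seed.

```python
import math
import random
import statistics

rng = random.Random(2016)  # fixed seed so the sketch is reproducible


def simulate_sample_means(n, reps):
    """Sample means of n draws from a Uniform(0, 1) population."""
    return [statistics.fmean([rng.random() for _ in range(n)])
            for _ in range(reps)]


def simulate_sample_proportions(n, p, reps):
    """Sample proportions of successes in n 0/1 trials with success prob. p."""
    return [sum(rng.random() < p for _ in range(n)) / n
            for _ in range(reps)]


n_mean, n_prop, p, reps = 25, 50, 0.3, 4000  # np = 15 and n(1-p) = 35
means = simulate_sample_means(n_mean, reps)
props = simulate_sample_proportions(n_prop, p, reps)

# Theoretical values from the slides
sd_mean_theory = math.sqrt(1 / 12) / math.sqrt(n_mean)
sd_prop_theory = math.sqrt(p * (1 - p) / n_prop)

print("sample mean:       center", round(statistics.fmean(means), 4),
      "SD", round(statistics.stdev(means), 4),
      "theory", round(sd_mean_theory, 4))
print("sample proportion: center", round(statistics.fmean(props), 4),
      "SD", round(statistics.stdev(props), 4),
      "theory", round(sd_prop_theory, 4))
```

The simulated centers should sit close to µ = 0.5 and p = 0.3, and the simulated SDs close to the theoretical values, up to simulation noise.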
Inference: Single Mean µ
Known SD σ: confidence intervals and test statistics involve standard deviation and normal critical values
$\left( \bar{X} - Z^* \frac{\sigma}{\sqrt{n}},\ \bar{X} + Z^* \frac{\sigma}{\sqrt{n}} \right)$ and $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Unknown SD σ: confidence intervals and test statistics involve standard error and critical values from a t distribution with n-1 degrees of freedom
$\left( \bar{X} - t^*_{n-1} \frac{s}{\sqrt{n}},\ \bar{X} + t^*_{n-1} \frac{s}{\sqrt{n}} \right)$ and $T = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
t distribution has wider tails (more conservative)

Inference: Comparing Means µ1 and µ2
Known σ1 and σ2: two-sample Z statistic uses normal distribution
$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Unknown σ1 and σ2: two-sample T statistic uses t distribution with degrees of freedom k = min(n1 - 1, n2 - 1)
$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ and $(\bar{X}_1 - \bar{X}_2) \pm t^*_k \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Matched pairs: instead of difference of two samples X1 and X2, do a one-sample test on the difference d
$T = \frac{\bar{X}_d - 0}{s_d/\sqrt{n}}$ and $\bar{X}_d \pm t^*_{n-1} \frac{s_d}{\sqrt{n}}$
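The t procedures for means on the slides above can be sketched as follows. The function names and the tiny data sets are hypothetical, and the critical value t* is passed in by hand, as it would be read from a t table (2.776 is the 95% value for 4 degrees of freedom).

```python
import math
import statistics


def one_sample_t(data, mu0, t_star):
    """T = (xbar - mu0) / (s / sqrt(n)) and the interval xbar +/- t* s/sqrt(n)."""
    n = len(data)
    xbar = statistics.mean(data)
    se = statistics.stdev(data) / math.sqrt(n)  # standard error of the mean
    t_stat = (xbar - mu0) / se
    ci = (xbar - t_star * se, xbar + t_star * se)
    return t_stat, ci


def two_sample_t(x1, x2):
    """Two-sample T with the conservative df = min(n1 - 1, n2 - 1)."""
    n1, n2 = len(x1), len(x2)
    se = math.sqrt(statistics.variance(x1) / n1 + statistics.variance(x2) / n2)
    t_stat = (statistics.mean(x1) - statistics.mean(x2)) / se
    return t_stat, min(n1 - 1, n2 - 1)


def matched_pairs_t(before, after, t_star):
    """One-sample t procedure applied to the paired differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0, t_star)


# Hypothetical data, not from the course
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 6.0, 8.0, 10.0]

t_one, ci_one = one_sample_t(x1, mu0=2.0, t_star=2.776)
t_two, df_two = two_sample_t(x1, x2)
t_pair, ci_pair = matched_pairs_t(x1, x2, t_star=2.776)

print(f"one-sample: T = {t_one:.3f}, 95% CI = ({ci_one[0]:.3f}, {ci_one[1]:.3f})")
print(f"two-sample: T = {t_two:.3f}, df = {df_two}")
print(f"matched pairs: T = {t_pair:.3f}")
```

A p-value would then come from a t table or statistical software using the stated degrees of freedom; that lookup is left out because the Python standard library has no t distribution.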

Inference: Proportion p
Confidence interval for p uses the Normal distribution and the sample proportion:
$\hat{p} \pm Z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ where $\hat{p} = \frac{Y}{n}$
Hypothesis test for p = p0 also uses the Normal distribution and the sample proportion:
$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}}$

Inference: Comparing Proportions p1 and p2
Hypothesis test for p1 - p2 = 0 uses Normal distribution and complicated test statistic
$Z = \frac{\hat{p}_1 - \hat{p}_2}{SE(\hat{p}_1 - \hat{p}_2)}$ where $\hat{p}_1 = \frac{Y_1}{n_1}$ and $\hat{p}_2 = \frac{Y_2}{n_2}$
with pooled standard error:
$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\hat{p}_p (1 - \hat{p}_p) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$ where $\hat{p}_p = \frac{Y_1 + Y_2}{n_1 + n_2}$
Confidence interval for p1 - p2 also uses Normal distribution and sample proportions
$(\hat{p}_1 - \hat{p}_2) \pm Z^* \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$

Linear Regression
Use best fit line to summarize linear relationship between two continuous variables X and Y: $Y_i = \alpha + \beta X_i$
The slope ($b = r \, s_y / s_x$): average change you get in the Y variable if you increased the X variable by one
The intercept ($a = \bar{Y} - b \bar{X}$): average value of the Y variable when the X variable is equal to zero
Linear equation can be used to predict response variable Y for a value of our explanatory variable X

Significance in Linear Regression
Does the regression line show a significant linear relationship between the two variables?
$H_0: \beta = 0$ versus $H_a: \beta \neq 0$
Uses the t distribution with n-2 degrees of freedom and a test statistic calculated from JMP output
$T = \frac{b}{SE(b)}$
Can also calculate confidence intervals using JMP output and t distribution with n-2 degrees of freedom
$b \pm t^*_{n-2} \, SE(b)$ and $a \pm t^*_{n-2} \, SE(a)$

Urban Analytics in Philadelphia
Quantitative analysis of the economic and social functioning of local areas within large cities
Philadelphia is an interesting case study for contemporary issues in urban revival and gentrification
Creating empirical measures for concepts like urban vibrancy that have been difficult to quantify
Examined associations between crime, poverty, demographics and land use

Urban Analytics in Philadelphia
It is important to do quantitative analysis of large cities carefully and at the correct level of resolution
What we see when we look at the city in the aggregate can be quite different from what we see in specific neighborhoods
Both sides of the classic Jane Jacobs vs. urban renewal fight were based on empirical arguments
Jacobs' key innovation was basing her observations at a high resolution: individual streets and blocks rather than aggregating over entire cities
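Returning to the regression slides, the slope b = r s_y/s_x, the intercept a = Ȳ - b X̄, and the test statistic T = b/SE(b) can be computed by hand as a check on software output. In the lecture SE(b) is read from JMP; the sketch below instead assumes the usual textbook formula SE(b) = s_e / (s_x √(n-1)), with s_e the residual standard deviation on n-2 degrees of freedom. The function names and data are hypothetical.

```python
import math
import statistics


def fit_line(x, y):
    """Least-squares line via b = r * s_y / s_x and a = ybar - b * xbar."""
    n = len(x)
    xbar, ybar = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    r = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / ((n - 1) * sx * sy)
    b = r * sy / sx
    a = ybar - b * xbar
    return a, b, r


def slope_t_statistic(x, y):
    """T = b / SE(b) for H0: beta = 0, with n - 2 degrees of freedom."""
    n = len(x)
    a, b, _ = fit_line(x, y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))                        # residual SD
    se_b = s_e / (statistics.stdev(x) * math.sqrt(n - 1))  # assumed formula
    return b / se_b, n - 2


# Hypothetical data, not from the course
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b, r = fit_line(x, y)
t_stat, df = slope_t_statistic(x, y)
y_hat = a + b * 6.0  # predicted response at X = 6

print(f"a = {a:.3f}, b = {b:.3f}, r = {r:.3f}")
print(f"T = {t_stat:.3f} on {df} df, prediction at X = 6: {y_hat:.2f}")
```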

Last Class!
Thanks everyone for a great semester!
See you on May 10th for the final exam!