BIOSTATS 640 Intermediate Biostatistics Frequently Asked Questions Topic 1 FAQ 1 Review of BIOSTATS 540 Introductory Biostatistics


1. I'm confused about the jargon and notation, especially population versus sample. Could you please clarify?

In BIOSTATS 540, we were introduced to the notion of there being a population in the background (a source population, about which we would like to learn). We were also introduced to the idea of a sample drawn from that population: a collection of actual, observed and known, data values that we will use to draw some inferences about the population. We were reminded that, in real life, typically we do not have the luxury of examining the entirety of a population (that would be a census).

As regards jargon and notation, the convention is to use Greek letters to represent characteristics of the source population (we referred to these as parameters and parameter values) and Roman letters to represent characteristics of the sample. Reminder: a statistic is just a number that you calculate from the data in a sample. So here is a little refresher schematic that we might have compiled in BE540 so as to keep track:

                Parameter in Population     Estimate from Sample
    Mean        $\mu$                       $\bar{X}$
    Variance    $\sigma^2$                  $S^2$
    etc.

Here in BIOSTATS 640, when we learn about regression and correlation, a similar compilation allows us to keep track of what's what. Keep in mind that, typically, the statistic we calculate is calculated as our guess of the parameter in the population.

                                     Parameter in Population     Estimate from Sample
    Slope of line of Y on X          $\beta_1$                   $\hat{\beta}_1$ or $b_1$
    Intercept of line of Y on X      $\beta_0$                   $\hat{\beta}_0$ or $b_0$

Additional note to reader: See the little hat on top? Whenever you see the little hat on top, this is telling you that what you are looking at is an estimate. It's unfortunate that the letter is Greek, but the key is to notice that the little hat means the quantity is an estimate obtained from the data and is therefore a statistic. The little hat also goes by the name caret. Whenever you see it, think estimate.
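To make the parameter-versus-estimate bookkeeping concrete, here is a minimal sketch in Python. The data values are made up for illustration, and the slope and intercept are computed with the usual least-squares formulas, which is an assumption on our part since this FAQ does not say how the estimates are obtained. The variable names (x_bar, s2_y, b1, b0) are ours.

```python
from statistics import mean, variance

# Hypothetical sample of paired observations (x_i, y_i); the values are made up.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Statistics (Roman letters) computed from the sample; each one is our
# guess at a population parameter (Greek letter) that we never observe.
x_bar = mean(x)       # estimate of the population mean of X
y_bar = mean(y)       # estimate of the population mean of Y
s2_y = variance(y)    # sample variance (n - 1 divisor), estimate of sigma^2

# Least-squares estimates of the slope and intercept of the line of Y on X
# (assumed estimator; these are the "hatted" betas, also written b1 and b0).
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx             # estimate of beta_1
b0 = y_bar - b1 * x_bar    # estimate of beta_0

print("x_bar =", x_bar, " y_bar =", y_bar, " s2_y =", s2_y)
print("b1 =", round(b1, 3), " b0 =", round(b0, 3))
```

Every quantity printed here is a statistic. A different sample from the same population would give different numbers, which is the idea behind sampling distributions.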

2. Remind me again of the distinction between standard deviation (SD or S) and standard error (SEM or SE) and how this is related to the distinction between populations versus sampling distributions.

SD or S: The standard deviation is the square root of the variance of values of individuals in nature. The collection of all possible individuals in nature goes by the name population.

SEM or SE: The standard error is the square root of the variance of values of a statistic. The collection of all possible values of a statistic (imagine replicating your study over and over a gazillion times and compiling the collection of all possible sample means $\bar{X}$) does not go by the name population, even though this would make sense. Instead, this collection of all possible values of whatever statistic you're interested in goes by the name sampling distribution.

So who cares? Well, actually there are times when we are very interested in the sampling distribution of $\bar{X}$ (e.g., clinical trials). And there are times when we might be interested in the sampling distribution of $S^2$ (e.g., studies of lab performance). By extension, we can imagine that there might be times when we're interested in some other statistic. In our unit on regression and correlation, for example, we will see that we are interested in the sampling distribution of an estimated slope, $\hat{\beta}_1$.

I guess I don't see why I'd be interested in the sampling distribution of $\hat{\beta}_1$.

You're interested in the sampling distribution of $\hat{\beta}_1$ when you're interested in what another investigator might obtain as a $\hat{\beta}_1$ if he/she were to repeat your study and come up with his/her own estimate. Whether you're aware of this consciously or not, this is the kind of thing you are interested in (generalizability and robustness are some familiar terms for this) when you read a journal article and want to know if you would obtain similar findings if you were to repeat the published study in your own sample of folks.
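The "replicate your study a gazillion times" idea can be imitated on a computer. The sketch below is only an illustration under assumed values: a Normal population with mean 100 and standard deviation 15, samples of size 25, and 5000 replications standing in for "a gazillion". None of these numbers come from the course materials.

```python
import random
from statistics import mean, stdev

random.seed(640)  # fixed seed so the run is repeatable

MU, SIGMA = 100.0, 15.0   # hypothetical population parameters
N = 25                    # sample size of each replicated study
REPLICATES = 5000         # stand-in for "a gazillion" repetitions


def one_study_mean():
    """Draw one sample of size N from the population; return its sample mean."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    return sum(sample) / N


# The collection of sample means approximates the sampling distribution of X-bar.
sample_means = [one_study_mean() for _ in range(REPLICATES)]

empirical_se = stdev(sample_means)   # spread of the statistic X-bar
theoretical_se = SIGMA / N ** 0.5    # sigma / sqrt(n)

print("SD of individuals (sigma):       ", SIGMA)
print("SE of X-bar, simulated:          ", round(empirical_se, 3))
print("SE of X-bar, sigma / sqrt(n):    ", theoretical_se)
```

The spread of individuals (the SD) stays at 15 no matter how many studies are run, while the spread of the sample means (the SE) is much smaller and shrinks as the sample size grows.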

3. Ick. I don't understand summation notation.

Unfortunately, notation does get in the way of understanding ideas sometimes. The summation notation is nothing more than a secretarial convenience. We use it to avoid having to write out long expressions. For example, instead of writing $x_1 + x_2 + x_3 + x_4 + x_5$, we write

$$\sum_{i=1}^{5} x_i$$

Another example: instead of writing $x_1 * x_2 * x_3 * x_4 * x_5$, we write

$$\prod_{i=1}^{5} x_i$$

This is actually an example of the product notation.

Key to the summation notation:
The Greek symbol sigma ($\Sigma$) says "add up some items".
Below the sigma symbol is the starting point ($i=1$: START HERE).
Up on top is the ending point ($5$: END).
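For readers who find code easier than symbols, the same two ideas can be sketched in Python. The five values below are made up, and the loop is written out longhand only to show where the starting and ending points of the index appear.

```python
import math

# Hypothetical values x_1, ..., x_5 (Python lists count from 0, so x[0] is x_1).
x = [2, 3, 5, 7, 11]

# Summation notation: start at i = 1, end at i = 5, add up the items.
total = 0
for i in range(1, 6):      # i = 1, 2, 3, 4, 5
    total += x[i - 1]      # x_i lives at list position i - 1

# Product notation: same start and end, but multiply instead of add.
product = 1
for i in range(1, 6):
    product *= x[i - 1]

# The built-in shortcuts give the same answers.
print(total, sum(x))             # 28 28
print(product, math.prod(x))     # 2310 2310
```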

4. What are Z-scores, what are t-scores and what is the distinction between them?

The Z-score is a tool to compute probabilities of intervals of values for X distributed Normal($\mu$, $\sigma^2$). Suppose it is of interest to calculate a probability for a random variable X that is distributed Normal($\mu$, $\sigma^2$). Sometimes (less so as time goes on because internet resources are getting better all the time), we're in a pickle because tabulated normal probabilities are available only for the Normal distribution with $\mu = 0$ and $\sigma^2 = 1$. We solve our problem by exploiting an equivalence argument. The technique goes by the name standardization and involves replacing the desired calculation with an equivalent one for a new random variable called a z-score. Standardization expresses the desired calculation for X distributed Normal($\mu$, $\sigma^2$) as an equivalent calculation for Z (Z is now called a Z-score), where Z is distributed standard normal, Normal(0, 1).

$$\Pr[a \le X \le b] = \Pr\left[\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right]. \quad \text{Thus, Z-score} = \frac{X-\mu}{\sigma}$$

Note - The technique of standardization of X involves centering (by subtraction of the mean of X, which is $\mu$) followed by rescaling (using the multiplier $1/\sigma$). Watch out when you are performing standardization that the rescaling is with the correct variance. Here are 3 examples followed by a generic one, just to be sure that you get the idea:

(1) $$\Pr[a \le X \le b] = \Pr\left[\frac{a-\mu}{\sigma} \le \text{Z-score} \le \frac{b-\mu}{\sigma}\right]$$

(2) $$\Pr[a \le \bar{X} \le b] = \Pr\left[\frac{a-\mu}{\sigma/\sqrt{n}} \le \text{Z-score} \le \frac{b-\mu}{\sigma/\sqrt{n}}\right]$$

(3) $$\Pr[a \le \hat{\beta}_1 \le b] = \Pr\left[\frac{a-E(\hat{\beta}_1)}{SE(\hat{\beta}_1)} \le \text{Z-score} \le \frac{b-E(\hat{\beta}_1)}{SE(\hat{\beta}_1)}\right]$$

(4) Generic: $$\Pr[a \le \text{statistic} \le b] = \Pr\left[\frac{a-E(\text{statistic})}{SE(\text{statistic})} \le \text{Z-score} \le \frac{b-E(\text{statistic})}{SE(\text{statistic})}\right]$$
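As a rough illustration of standardization, the sketch below uses Python's built-in statistics.NormalDist in place of a printed standard normal table. The function name prob_between and all numeric values (mean 120, standard deviation 10, sample size 25) are hypothetical choices for this example only.

```python
from statistics import NormalDist

# The standard normal, Normal(0, 1), plays the role of the printed table.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def prob_between(a, b, center, scale):
    """Pr[a <= statistic <= b] by standardization.

    center is E(statistic); scale is its SD (for individuals) or SE (for a statistic).
    """
    z_a = (a - center) / scale   # centering, then rescaling by 1/scale
    z_b = (b - center) / scale
    return STANDARD_NORMAL.cdf(z_b) - STANDARD_NORMAL.cdf(z_a)


mu, sigma, n = 120.0, 10.0, 25   # hypothetical population values and sample size

# Example (1): an individual X, rescaled by sigma.
p_individual = prob_between(110, 135, mu, sigma)

# Example (2): a sample mean X-bar, rescaled by sigma / sqrt(n).
p_mean = prob_between(118, 123, mu, sigma / n ** 0.5)

# Check against the non-standardized calculation for X itself.
direct = NormalDist(mu, sigma).cdf(135) - NormalDist(mu, sigma).cdf(110)

print(round(p_individual, 4), round(direct, 4), round(p_mean, 4))
```

Both calls end up with z-scores of -1 and 1.5, so they give the same probability. The only thing that changed between them is the scale used for rescaling, which is the "watch out" in the note above.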

The z-score method is appropriate under 2 circumstances: (1) when the starting variable is distributed Normal to begin with, and (2) when the starting variable can be appreciated as an instance of the central limit theorem (not discussed here).

A t-score is a Student's t random variable. There are lots of ways to have a random variable that is distributed Student's t. One is to conceive of a Student's t random variable as a t-score and, in this way, analogous to a z-score.

One definition of a Student's t random variable: In the setting of a random sample $X_1, \ldots, X_n$ of independent, identically distributed outcomes of a Normal($\mu$, $\sigma^2$) distribution, where we calculate $\bar{X}$ and $S^2$ in the usual way,

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \quad \text{and} \quad S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1},$$

a Student's t distributed random variable results if we construct a t-score instead of a z-score:

$$\text{t-score} = t_{DF=n-1} = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

is distributed Student's t with degrees of freedom = (n-1).

Note - If we want to standardize $\bar{X}$, the solution depends on whether we know its SE or we don't.

                                      SE($\bar{X}$) is known                          SE($\bar{X}$) is NOT known
    Standardization of $\bar{X}$      Z-score = $(\bar{X}-\mu)/SE(\bar{X})$           t-score = $(\bar{X}-\mu)/\widehat{SE}(\bar{X})$
    Where                             $SE(\bar{X}) = \sigma/\sqrt{n}$                 $\widehat{SE}(\bar{X}) = S/\sqrt{n}$

Recall

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$
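The known-versus-unknown SE distinction can be sketched in a few lines of Python. The sample values, the hypothesized mean of 5, and the "known" sigma of 2 are all made up, and the function names t_score and z_score are ours.

```python
from statistics import mean, stdev


def z_score(sample, mu0, sigma):
    """Standardize X-bar when sigma, and therefore SE(X-bar), is known."""
    n = len(sample)
    se = sigma / n ** 0.5               # SE(X-bar) = sigma / sqrt(n)
    return (mean(sample) - mu0) / se


def t_score(sample, mu0):
    """Standardize X-bar when SE(X-bar) must be estimated from the sample."""
    n = len(sample)
    s = stdev(sample)                   # square root of S^2, with n - 1 divisor
    se_hat = s / n ** 0.5               # estimated SE(X-bar) = S / sqrt(n)
    return (mean(sample) - mu0) / se_hat, n - 1   # t-score and its degrees of freedom


sample = [4, 6, 5, 7, 8]                # hypothetical data, n = 5
z = z_score(sample, mu0=5, sigma=2)     # pretend sigma = 2 is known
t, df = t_score(sample, mu0=5)          # sigma unknown, use S instead

print("z-score =", round(z, 4))
print("t-score =", round(t, 4), "with DF =", df)
```

The numerator is the same in both cases. Only the denominator changes, and swapping the known sigma for the estimate S is what moves the reference distribution from Normal(0, 1) to Student's t with n-1 degrees of freedom.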