Summary of the lecture in Biostatistics

Probability Density Function

For a continuous random variable $X$, a probability density function is a function $f(x)$ such that:

(1) $f(x) \ge 0$
(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
(3) $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$

A probability density function provides a simple description of the probabilities associated with a random variable.

Example

The time (in ms) until a chemical reaction is complete is described by its cumulative distribution function $F(x)$ as follows:

$$F(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-0.01x}, & x \ge 0 \end{cases}$$

a) Find its $f(x)$.
b) What proportion of reactions is complete within 200 ms?

a) $$f(x) = \frac{dF(x)}{dx} = \begin{cases} 0, & x < 0 \\ 0.01\,e^{-0.01x}, & x \ge 0 \end{cases}$$

b) $P(X < 200) = F(200) = 1 - e^{-2} = 0.8647$

Histogram

A histogram is an approximation to a probability density function. For each interval of the histogram, the area of the bar equals the relative frequency (proportion) of the measurements in the interval. The relative frequency is an estimate of the probability that a measurement falls in the interval.

Cumulative Distribution Function

The cumulative distribution function of a continuous random variable $X$ is defined as

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du \quad \text{for any } x.$$

Mean or Expected Value

If $X$ is a continuous random variable with probability density function $f(x)$, the mean or expected value of $X$ is defined as

$$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$$
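The chemical reaction example can be checked numerically. The sketch below is only an illustration: the helper `integrate` and the constant `RATE` are names introduced here, and the upper integration limit of 5000 ms for the mean is an assumption that the tail beyond it is negligible. It confirms that integrating $f(x)$ from 0 to 200 reproduces $F(200)$, and it evaluates the mean $E(X)$ from the definition above.

```python
import math

RATE = 0.01  # per millisecond, taken from F(x) = 1 - exp(-0.01 x) in the example


def cdf(x):
    """Cumulative distribution function F(x) of the reaction time."""
    return 0.0 if x < 0 else 1.0 - math.exp(-RATE * x)


def pdf(x):
    """Density f(x) = dF/dx of the reaction time."""
    return 0.0 if x < 0 else RATE * math.exp(-RATE * x)


def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


# Part b) of the example, computed two ways.
proportion_by_cdf = cdf(200.0)
proportion_by_integral = integrate(pdf, 0.0, 200.0)

# Mean reaction time; 5000 ms is an assumed cutoff for the infinite upper limit.
mean_time = integrate(lambda x: x * pdf(x), 0.0, 5000.0)

print(round(proportion_by_cdf, 4))       # 0.8647
print(round(proportion_by_integral, 4))  # 0.8647
print(round(mean_time, 2))
```

Both routes give the same proportion, which is property (3) of the density applied to the interval from 0 to 200 ms.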

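However, if we have just some observations of $X$, denoted as $x_1, x_2, \ldots, x_n$, then the sample mean is

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .$$

If we have the whole population of $N$ items, then the population mean is

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i .$$

Variance & Standard Deviation

If $X$ is a continuous random variable with probability density function $f(x)$, the variance of $X$ is defined as

$$\sigma^2 = V(X) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 ,$$

where $\sigma$ is the standard deviation of $X$.

In the case of $n$ observations of $X$, the sample variance is defined as

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1},$$

where $s$ is the sample standard deviation. Similarly, the population variance is defined as

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}.$$

(Do you remember the reasoning behind the $n-1$ divisor in the sample variance?)

Normal Distribution

A random variable $X$ with probability density function

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{for } -\infty < x < \infty$$

has a normal distribution with parameters $\mu$ and $\sigma$, where $-\infty < \mu < \infty$ and $\sigma > 0$.

Some useful results concerning any normal random variable $X$ are:

$P(\mu - \sigma < X < \mu + \sigma) \approx 68\%$
$P(\mu - 2\sigma < X < \mu + 2\sigma) \approx 95\%$
$P(\mu - 3\sigma < X < \mu + 3\sigma) \approx 99.7\%$

The random variable defined as $Z = \dfrac{X - \mu}{\sigma}$ is a normal random variable with $E(Z) = 0$ and $V(Z) = 1$. That is, $Z$ is a standard normal random variable.

Central Limit Theorem (one of the most useful theorems in Statistics)

If $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ taken from a population with mean $\mu$ and finite variance $\sigma^2$, and if $\bar{X}$ is the sample mean, then the limiting form of the distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

as $n \to \infty$ is the standard normal distribution.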
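The normal-distribution percentages and the Central Limit Theorem can both be checked with the Python standard library. This is a rough sketch under stated assumptions: the exponential population with mean 100, the sample size of 30, the 5000 replicates, and the fixed seed are all choices made here for illustration and do not come from the lecture.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

std_normal = NormalDist(mu=0.0, sigma=1.0)


def prob_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal X."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


coverage = {k: prob_within(k) for k in (1, 2, 3)}
print({k: round(p, 4) for k, p in coverage.items()})
# {1: 0.6827, 2: 0.9545, 3: 0.9973}

# CLT check on a skewed population (assumed here): exponential with mean 100,
# for which the standard deviation is also 100.
rng = random.Random(0)
POP_MEAN = POP_SD = 100.0
n, replicates = 30, 5000

z_values = []
for _ in range(replicates):
    sample = [rng.expovariate(1.0 / POP_MEAN) for _ in range(n)]
    # Standardize the sample mean exactly as in the theorem.
    z_values.append((fmean(sample) - POP_MEAN) / (POP_SD / sqrt(n)))

# If Z is close to standard normal, about 95% of values fall within +/- 1.96.
share_within_1_96 = sum(abs(z) < 1.96 for z in z_values) / replicates
print(round(share_within_1_96, 3))
```

Even though the population is far from normal, the standardized sample means behave approximately like a standard normal variable at this sample size.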

Correlation Coefficient

The correlation between random variables $X$ and $Y$, denoted as $\rho_{XY}$, is defined as

$$\rho_{XY} = \frac{\operatorname{cov}(X, Y)}{\sqrt{V(X)\,V(Y)}} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$$

Covariance (and hence correlation) is a measure of linear association between two random variables. If the relationship is non-linear, covariance may not be sensitive to the relationship. If two random variables are independent, then their covariance (and therefore their correlation) is zero, while the converse is not necessarily true.

Wide Sense Stationary Signals

When both the mean and variance of a random process are time invariant, and its autocorrelation function is not a function of time but only of the time difference between the two points, the process is called stationary in the wide sense.

Ergodicity

When the statistical averaging over the ensemble equals the time averaging over the time axis of any sample function, the process is called ergodic. An ergodic process therefore has to be a stationary process as well.

Hypothesis, p-value, Type I & II Errors, Power

A statistical hypothesis ($H_0$) is a statement about the parameters of one or more populations. The p-value is the smallest level of significance that would lead to rejection of the null hypothesis with the given data.

Rejecting the null hypothesis when it is true is defined as a type I error, and alpha is the probability of making such an error. That is, $\alpha = P(\text{type I error}) = P(\text{reject } H_0 \text{ when } H_0 \text{ is true})$. Alpha is the significance level chosen before the test; it is the threshold against which the p-value is compared, not the p-value itself.

Failing to reject the null hypothesis $H_0$ when it is false is defined as a type II error, and beta is the probability of making such an error. That is, $\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \text{ when } H_0 \text{ is false})$.

The power of a statistical test is the probability of rejecting the null hypothesis when the alternative hypothesis is true. The power is computed as $1 - \beta$. Power is a very descriptive and concise measure of the sensitivity of a statistical test, where by sensitivity we mean the ability of the test to detect differences.

Example

To illustrate the above definitions by an example, consider the null hypothesis that states the height of an average Canadian infant is 50 cm with a standard deviation of 2.5 cm.
Let's assume that we would like to test this hypothesis with a sample of 10 infants. Therefore our $H_0$ and $H_1$ are:

$H_0: \mu = 50$ cm
$H_1: \mu \ne 50$ cm

The sample mean is an estimate of the true population mean $\mu$. A value of the sample mean $\bar{x}$ that falls close to the hypothesized value of $\mu = 50$ cm is evidence that the true mean $\mu$ is really 50 cm; that is, such evidence supports the null hypothesis $H_0$. On the other hand, a sample mean that is considerably different from 50 cm is evidence in support of the alternate hypothesis $H_1$. Thus, the sample mean is the test statistic in this case.

Now, let's assume that if the sample mean falls between 48.5 and 51.5, we will not reject the null hypothesis. These two critical values bound the acceptance region; the values outside them constitute the critical region for the test and in essence determine $\alpha$, the probability of making a type I error:

$$\alpha = P(\bar{x} < 48.5 \text{ when } \mu = 50) + P(\bar{x} > 51.5 \text{ when } \mu = 50)$$

To find this probability we use the Central Limit Theorem, $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, to find the z-values, so that we can then use the normal distribution table to get the probabilities. The z-values that correspond to the critical values of 48.5 and 51.5 are

$$z_1 = \frac{48.5 - 50}{2.5/\sqrt{10}} = -1.90 \quad \text{and} \quad z_2 = \frac{51.5 - 50}{2.5/\sqrt{10}} = 1.90 .$$

Therefore,

$$\alpha = P(Z < -1.90) + P(Z > 1.90) = 0.028717 + 0.028717 = 0.057434 \approx 5.7\%$$

This implies that about 5.7% of all random samples would lead to rejection of the null hypothesis ($\mu = 50$ cm) when the true mean is really 50 cm. We can decrease this rate (in other words, raise the confidence level $1 - \alpha$) by widening the acceptance region. For example, if we make the critical values 48 and 52 instead of 48.5 and 51.5, then

$$\alpha = P\left(Z < \frac{48 - 50}{2.5/\sqrt{10}}\right) + P\left(Z > \frac{52 - 50}{2.5/\sqrt{10}}\right) = 0.0114 \approx 1.1\% .$$

We could also reduce $\alpha$ by increasing the sample size without widening the acceptance region. For example, if we choose $n = 16$ instead of a sample size of 10, then we have

$$\alpha = P\left(Z < \frac{48.5 - 50}{2.5/\sqrt{16}}\right) + P\left(Z > \frac{51.5 - 50}{2.5/\sqrt{16}}\right) = 0.0164 \approx 1.6\% .$$

Now, let's see what the probability of a type II error is. To calculate $\beta$ we must have a specific alternate hypothesis; that is, we must have a particular value of $\mu$. For example, suppose that it is important to reject the null hypothesis whenever the mean height is greater than 52 cm or less than 48 cm.
Because of the symmetry of the normal distribution, it is enough to calculate only one of the cases; say, find the probability of accepting the null hypothesis ($\mu = 50$ cm) when the true mean is 52 cm. Therefore,

$$\beta = P(48.5 \le \bar{x} \le 51.5 \text{ when } \mu = 52).$$

The corresponding z-values are

$$z_1 = \frac{48.5 - 52}{2.5/\sqrt{10}} = -4.43 \quad \text{and} \quad z_2 = \frac{51.5 - 52}{2.5/\sqrt{10}} = -0.63 .$$

Therefore,

$$\beta = P(-4.43 \le Z \le -0.63) = P(Z \le -0.63) - P(Z \le -4.43) = 0.2643 - 0.0000 = 0.2643 \approx 26.4\% .$$

Thus, if we are testing $H_0: \mu = 50$ cm against $H_1: \mu \ne 50$ cm with $n = 10$, and the true value of the mean is $\mu = 52$ cm, the probability that we will fail to reject the false null hypothesis is about 26.4%. By symmetry, if the true value of the mean is 48 cm, the value of $\beta$ will also be 26.4%.

The probability of making a type II error increases rapidly as the true value of $\mu$ approaches the hypothesized value. The type II error probability also depends on the sample size $n$; increasing the sample size results in a decrease in the probability of a type II error.

The above results can be summarized as follows.

1. The size of the critical region, and consequently the probability of a type I error, $\alpha$, can always be reduced by appropriate selection of the critical values.
2. Type I and II errors are related. A decrease in the probability of one type of error always results in an increase in the probability of the other, provided that the sample size $n$ doesn't change.
3. An increase in sample size will generally reduce both $\alpha$ and $\beta$, provided that the critical values are held constant.
4. When the null hypothesis is false, $\beta$ increases as the true value of the parameter approaches the value hypothesized in the null hypothesis. The value of $\beta$ decreases as the difference between the true mean and the hypothesized value increases.

Generally, the analyst controls the type I error probability $\alpha$ when selecting the critical values. Thus, it is usually easy for the analyst to set the type I error probability at (or near) any desired value. Since the analyst can directly control the probability of wrongly rejecting $H_0$, rejection of the null hypothesis is regarded as a strong conclusion.

Another important concept is the power of the statistical test, which is in fact $1 - \beta$. The power can be interpreted as the probability of correctly rejecting a false null hypothesis. Statistical tests are often compared by their power properties. For example, consider the case that the true mean height is 52 cm; in that case, we found $\beta = 0.2643$. So, the power of this test is $1 - \beta = 0.7357$.
In the case discussed above, a power of 0.7357 while the true mean is 52 cm means that the test will correctly reject the null hypothesis $H_0: \mu = 50$ cm 73.57% of the time. If this value of power is judged to be too low, the analyst can increase either $\alpha$ or the sample size.
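The type I and type II error probabilities of the infant-height example can be recomputed directly from the normal distribution of the sample mean. The sketch below is illustrative only; the function names `alpha` and `beta` are introduced here. Because it uses exact z-values rather than table lookups rounded to two decimals, the results differ slightly in the third decimal place from the hand calculation (roughly 0.058 instead of 0.0574 for the first $\alpha$, and roughly 0.2635 instead of 0.2643 for $\beta$).

```python
from math import sqrt
from statistics import NormalDist

MU0 = 50.0   # hypothesized mean height in cm (from the example)
SIGMA = 2.5  # population standard deviation in cm (from the example)


def alpha(lower, upper, n):
    """P(reject H0 | mu = MU0): the sample mean falls outside [lower, upper]."""
    xbar = NormalDist(MU0, SIGMA / sqrt(n))
    return xbar.cdf(lower) + (1.0 - xbar.cdf(upper))


def beta(lower, upper, n, true_mu):
    """P(fail to reject H0 | mu = true_mu): the sample mean falls inside [lower, upper]."""
    xbar = NormalDist(true_mu, SIGMA / sqrt(n))
    return xbar.cdf(upper) - xbar.cdf(lower)


# Type I error for the three settings discussed in the example.
alpha_base = alpha(48.5, 51.5, n=10)
alpha_wide = alpha(48.0, 52.0, n=10)   # wider acceptance region
alpha_n16 = alpha(48.5, 51.5, n=16)    # larger sample, same critical values

# Type II error and power when the true mean is 52 cm.
beta_52 = beta(48.5, 51.5, n=10, true_mu=52.0)
power_52 = 1.0 - beta_52

# Summary points 3 and 4: beta grows as the true mean nears 50 and shrinks with n.
beta_51 = beta(48.5, 51.5, n=10, true_mu=51.0)
beta_52_n16 = beta(48.5, 51.5, n=16, true_mu=52.0)

print(round(alpha_base, 4), round(alpha_wide, 4), round(alpha_n16, 4))
print(round(beta_52, 4), round(power_52, 4))
print(round(beta_51, 4), round(beta_52_n16, 4))
```

Comparing `beta_51` with `beta_52`, and `beta_52_n16` with `beta_52`, shows numerically how the type II error depends on the true mean and on the sample size.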