Estimation for Complete Data

Size: px
Start display at page:

Download "Estimation for Complete Data"

Transcription

1 Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of each idividual is recorded. A group data is used whe the populatio uder study is large, so we put idividual observatios ito groups.

2 . Empirical distributio for complete idividual data (sectio.2) The desity fuctio of the populatio from which the data is collected is deoted by f(x). Its distributio fuctio is deoted by F (x). Its survival fuctio is deoted by S(x). Its hazard rate fuctio is deoted by h(x). Fially its cumulative hazard fuctio is deoted by H(x) ad is defied by H(x) = l S(x) Note that for cotiuous radom variables we have H (x) = S (x) S(x) = f(x) S(x) = h(x) To estimate these fuctios we collect a sample from the populatio ad the usig the data such collected we create some fuctios that are used to estimate these fuctios. First of all suppose that we have collected data poits, some of which might be repeated poits. We assume that the collected data represets the whole populatio, so we actually we assume that it is the populatio itself. Sice we have ot kowledge about the distributio of the data (ad that s why the estimatio comes i) the data poits have o privilege over each other ad therefore we assig the probability desity fuctio, which is deoted by f (x), is to each data poit. Therefore the associated f (x) = for each observed value It is called the empirical desity fuctio The distributio fuctio associated with this desity is called the empirical distributio fuctio ad is deoted by F (x) F (x) = umber of observatios beig less tha or equal to x The empirical survival fuctio is S (x) = F (x) 2

3 The empirical hazard rate fuctio is h (x) = f (x) S (x) Ad fially, the empirical cumulative hazard fuctio is H (x) = l S (x) Example (data from the Fia s ote). The followig loss values have bee obtaied 4, 50, 50, 50, 60, 75, 80, 20, 30 Calculate f (x), F (x), S (x), h (x), ad H (x) for all x Solutio. x otherwise f (x) x < 4 x < 4 4 x < x < 50 F (x) = 4 50 x < x < x < 80 S (x) = 5 50 x < x < x < x < x < x < x 20 x < x 3

4 H (x) = l S (x) = 0 x < x < x < x < x < x < x < x x otherwise h (x) udefied udefied Aother way of estimatig the cumulative hazard fuctio is by meas of Nelso-Åale estimatio. Before itroducig this estimatio, we eed to itroduce some otatios: For the observed values {x,,..., x } let y < y 2 < < y k be the uique values of the x i s. Let s j = i I(x i = y i ) be the umber of times the observatio y j appears i the sample; here I deote the logical fuctio that returs if the argumet is true ad returs zero if the argumet is false. Note that s + + s k = Example. For the data set of the previous example we have 4

5 y j s j total For each j we associate a so-called risk set r j as follows: r j = the umber of observatios greater tha or equal to y j = the sum of those s values with idices beig larger tha or equal to j = s j + s j+ + + s k = i=k i=j s i Note that the r j s are decreasig ad So we have r j = s j + r j. Note. I some mauscripts you fid a better otatio (i ) istead of r i because r i is the umber of idividuals at time i before ay evet at that time. I cotiuatio, ote that with this otatio, for every x satisfyig y j x < y j we have F (x) = F (y j ) = = umber of idividuals beig less tha or equal to y j j i= s i = k i=j s i = r j so 5

6 F (x) = 0 x < y r j y j x < y j j = 2,..., k y k x S (x) = x < y r j y j x < y j j = 2,..., k 0 y k x 0 x < y H (x) = l(s (x)) = l ( r j ) y j x < y j j = 2,..., k udefied y k x Note that from r i s i = r i+ we have r j = r j r = Therefore: r j r j r2 j r i+ = = r j r j 2 r r i= i j i= r i s i r i = j i= ( s ) i r i 6

7 0 x < y H (x) = l j i= ( ) s i r i y j x < y j j = 2,..., k udefied y k x We recall from calculus that the values ad therefore for x < we ca approximate e x = + x + x2 2! + x3 3! + e x + x ad by chagig x to x we get Because of this, we approximate s i r i approximatio j i= e x x x < ( ) by exp s i r i ad therefore we will have the ( s ) j ( i exp s ) ( i = exp r i r i i= j i= ) s i r i The j l i= ( s ) j i s i r i r i= i So, H (x) that we have above, ca be approximated with the the Nelso-Åale estimate : 7

8 0 x < y Ĥ(x) = j i= s i r i y j x < y j j = 2,..., k k i= s i r i y k x Oce this is foud, the by settig Ŝ(x) = exp ( Ĥ(x) ) we have a estimatio for the survival fuctio. Example (from the Fia s ote) The followig loss values have bee obtaied 4, 50, 50, 50, 60, 75, 80, 20, 30 Calculate f (x), F (x), Fid the Nelso-Åale estimate for the cumulative hazard fuctio ad the estimate the survival fuctio. Solutio. Ĥ(x) = 0 x < 4 4 x < = x < = x < = x < = x < = x < = x 8

9 ( Ĥ(x) ) Ŝ(x) = exp = x < x < x < x < x < x < x < x

10 2. Empirical distributio for grouped data (sectio.3) For grouped data we divide the values ito k itervals: (c 0, c ], (c, c 2 ],..., (c k, c k ] Suppose that there are observatios i total, ad the umber of those who fall ito the iterval (c j, c j ] is j ; so k j= j =. We the defie F at the edpoits of those itervals by: F (c 0 ) = 0 F (c j ) = j i= i j =,..., k ad the we use a liear iterpolatio to defie F (x) for either poits of the itervals F (x) = c j x c j c j F (c j ) + x c j c j c j F (c j ) c j x c j Its graph is called a ogive. By differetiatig this fuctio we get the estimate for the estimate for the desity fuctio : f (x) = F (c j ) F (c j ) c j c j = j (c j c j ) c j x < c j Its graph is called a histogram. Note. This desity is a spliced desity fuctio. Note. By takig complemets of both sides of the iterpolatio relatioship, we get S (x) = c j x c j c j S (c j ) + x c j c j c j S (c j ) c j x c j Example (from the Fia s ote)- 50. ad Calculate the empirical distributio fuctio ad the empirical desity fuctio for the followig grouped data. 0

11 Iterval Number of observatios (0, 2] 25 (2, 0] 0 (0, 00] 0 (00, 000] 5 Solutio. We first fid F 50 at the edpoits: F 50 (0) = 0 F 50 (2) = = 0.5 F 50 (0) = = 0.7 F 50 (00) = = 0. F 50 (000) = We the calculate F 50 at other poits usig iterpolatio; for example for the iterval (0, 00] we have: F 50 (x) = x 0 00 x (0.) Similarly we will have: x 4 0 x 2 (0.7) = x 0 0 (0.) + 00 x (0.7) = x x x 0 F 50 (x) = x x 00 x x 000 udefied 000 < x

12 4 0 x < x < 0 f 40 (x) = x < x < 000 udefied 000 x 2

13 3. Mea ad Variace of Empirical Estimators for Complete Idividual Data (from sectio 2.2) Let S (x) be the empirical survival fuctio. This radom variable is used to estimate the true value of S(x). For hypothesis testig ad some other reasos we eed the mea ad the variace of the estimator. To fid these quatities of iterest, we start from S (x) = umber of observatios that larger tha x = Y where Y is the umber of observatios that are bigger tha x. If {X,..., X } deote the values observed, this set is a i.i.d. from a populatio i which the probability of beig larger tha x is S(x), ad Y is the umber of this cases i our sample. So, Y is distributed as Biomial(, S(x)), so E(Y ) = S(x) Var(Y ) = S(x) ( S(x)) The from S (x) = Y E(S (x)) = E(Y ) we get : = S(x) Var(S (x)) = 2 Var(Y ) = S(x) ( S(x)) So, first of all, S (x) is a ubiased estimator for the value S(x) (which is ukow to us ad that s why we do samplig), ad secodly, it is a cosistet estimator. So it is a ubiased cosistet estimator. Note. Sice we do ot have the values S(x), we may ot be able to actually calculate the variace Var(S (x)), therefore i this formula we istead use S (x) for S(x) as S (x) is a 3

14 ubiased estimator of S(x). So we use Var[S (x)] = S (x)( S (x)) Questio. I the above discussio we foud a ubiased estimator for the probability S(x) = P (X > x). How about the probabilities P (a X b), P (a X < b), P (a < X b), P (a < X < b), P (X < x), P (X x),... Ca we have reasoable ubiased cosistet estimators for them?. Aswer. Yes. I fact the above argumet works for these cases too. For example if you wat to approximate P (a X b), the the estimator W = umber of observatios X i that satisfy a X i b is a ubiased estimator for P (a X b). Note. For a < b, to approximate the coditioal probabilities P (X > b X > b) = S(b) S (b) S (a). The approximate value for the variace would be ) ( S (b) S (a) S (b) S (a) #(X i > a) S(a) we use Example (data from the Fia s ote). The followig loss values have bee obtaied 4, 50, 50, 50, 60, 75, 80, 20, 30 Calculate f (x), F (x), S (x), h (x), ad H (x) for all x (i) What is the estimated variace of the estimator for P (X > 60). (ii) What is the estimated variace of the estimator for P (75 < X 20). (iii) What is the estimated variace of the estimator for P (X > 60 X > 50). 4

15 Solutio. (i). We estimate the variace by S (60)( S (60)) = ( 4 )( 5 ) = (ii). The approximate value for P (75 < X < 20) is umber of observatios X i that satisfy 75 < X i 20 = 2 So the the approximate value of the variace would be ( 2 )( 7 ) = 4 72 (iii). The approximate value for the coditioal probability P (X > 60 X > 50) = S(60) S(50) is S (60) S (50) = 4 5. The estimate value for the variace is ( 4 5 )( 5 ) 5 =

16 4. Mea ad Variace of Empirical Estimators for Grouped Data (from sectio 2.2) With the otatios used for grouped data, recall the followig two idetities: S (c j ) = F (c j ) = + + j S (x) = c j x c j c j S (c j ) + x c j c j c j S (c j ) c j x c j For a momet let Y be the umber of observatios up to c j : Y = + + j The Y is distributed as Biomial(, S(c j )) Let Z be the umber of observatios i (c j, c j ]. The Z is distributed as Biomial(, S(c j ) S(c j )). The: S (c j ) = Y S (c j ) = Y +Z The : E[S (c j )] = E(Y ) Similarly, = ( S(c j )) = S(c j ) E[S (c j )] = S(c j ) 6

17 The by takig the liear iterpolatio we get: x (c j, c j ] E[S (x)] = c j x c j c j E[S (c j )] + x c j c j c j E[S (c j )] = c j x c j c j S(c j ) + x c j c j c j S(c j ) O the other had: x (c j, c j ] S (x) = c j x c j c j ( ) Y x c + j ( ) c j c j Y +Z {( ) ( ) } cj x = Y c j c j + x cj Y +Z c j c j = { } (cj c j )Y +(x c j )Z (c j c j ) ad therefore Var[S (x)] = (c j c j ) 2 Var(Y ) + (x c j ) 2 Var(Z) + 2(c j c j )(x c j )Cov(Y, Z) 2 (c j c j ) 2 where: Var(Y ) = S(c j )[ S(c j )] Var(Z) = [S(c j ) S(c j )][ S(c j ) + s(c j )] Cov(Y, Z) = [ S(c j )][S(c j ) S(c j )] why? I the followig example we will see how to calculate this variace or its estimate without havig to memorize this formula. Sice the values o the right-had side are ukow, we substitute them with their estimates: we substitute S(c j ) with S (c j ), ad substitute S(c j ) with S (c j. This results i the estimates: 7

18 Var(Y ) = Y ( Y ) Var(Z) = Z( Z) Cov(Y, Z) = Y Z Example. What are E[f (x)] ad Var[f (x)]?. Solutio. We have f (x) = j (c j c j ) = Z (c j c j ) c j x < c j E[f (x)] = Var[f (x)] = E(Z) (c j c j ) = (S(c j ) S(c j )) = S(c j ) S(c j ) (c j c j ) c j c j Var(Z) 2 (c j c j ) 2 But we estimate this variace with Var[f (x)] = where Var(Z) Z( Z) = Var(Z) 2 (c j c j ) 2 Example (from the Fia s ote) For the followig grouped data, estimate the probability that a loss will be o more tha 0. Estimate the variace of the associated estimator. 8

19 Iterval Number of observatios (0, 2] 25 (2, 0] 0 (0, 00] 0 (00, 000] 5 Solutio. Sice the poit 0 is i the iterval (0, 00] we have: S (x) = c j x c j c j S (c j ) + x c j c j c j S (c j ) c j x c j S 50 (0) = 5 50 = 3 0 S 50 (00) = 5 50 = 0 ad the weights: right-ed weight right-ed weight = 80 0 = 8 S 50 (0) = ( )( 3 0 ) + (8 ) 0 = 0 So the aswer to the first part is: S 50 (0) = 7 0 = For the secod part: Y = umber of observatios i the iterval (0, 0] = 35 Var(Y ) = Y ( Y ) (35)(50 35) = = = 0.5 Z = umber of observatios i the iterval (0, 00] = 0

20 Var(Z) Y ( Z) = = (0)(40) = 8 50 Cov(Y, Z) = Y Z = (35)(0) = 7 50 S (x) = (c j c j )Y + (x c j )Z (c j c j ) S 50 (x) = (0)Y + (80)Z (50)(0) = 50 Y Z Var( S 50 (x)) = ( 50 )2 Var(Y ) + ( )2 Var(Z) + 2( 50 )( ) Cov(Y, Z) = ( 50 )2 (0.5) + ( )2 (8) + 2( 50 )( )( 7) =

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes.

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes. Term Test 3 (Part A) November 1, 004 Name Math 6 Studet Number Directio: This test is worth 10 poits. You are required to complete this test withi miutes. I order to receive full credit, aswer each problem

More information

Simulation. Two Rule For Inverting A Distribution Function

Simulation. Two Rule For Inverting A Distribution Function Simulatio Two Rule For Ivertig A Distributio Fuctio Rule 1. If F(x) = u is costat o a iterval [x 1, x 2 ), the the uiform value u is mapped oto x 2 through the iversio process. Rule 2. If there is a jump

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled 1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

More information

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f. Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE Part 3: Summary of CI for µ Cofidece Iterval for a Populatio Proportio p Sectio 8-4 Summary for creatig a 100(1-α)% CI for µ: Whe σ 2 is kow ad paret

More information

Review Questions, Chapters 8, 9. f(y) = 0, elsewhere. F (y) = f Y(1) = n ( e y/θ) n 1 1 θ e y/θ = n θ e yn

Review Questions, Chapters 8, 9. f(y) = 0, elsewhere. F (y) = f Y(1) = n ( e y/θ) n 1 1 θ e y/θ = n θ e yn Stat 366 Lab 2 Solutios (September 2, 2006) page TA: Yury Petracheko, CAB 484, yuryp@ualberta.ca, http://www.ualberta.ca/ yuryp/ Review Questios, Chapters 8, 9 8.5 Suppose that Y, Y 2,..., Y deote a radom

More information

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates. 5. Data, Estimates, ad Models: quatifyig the accuracy of estimates. 5. Estimatig a Normal Mea 5.2 The Distributio of the Normal Sample Mea 5.3 Normal data, cofidece iterval for, kow 5.4 Normal data, cofidece

More information

Unbiased Estimation. February 7-12, 2008

Unbiased Estimation. February 7-12, 2008 Ubiased Estimatio February 7-2, 2008 We begi with a sample X = (X,..., X ) of radom variables chose accordig to oe of a family of probabilities P θ where θ is elemet from the parameter space Θ. For radom

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS Jauary 25, 207 INTRODUCTION TO MATHEMATICAL STATISTICS Abstract. A basic itroductio to statistics assumig kowledge of probability theory.. Probability I a typical udergraduate problem i probability, we

More information

Chapter 6 Sampling Distributions

Chapter 6 Sampling Distributions Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to

More information

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise First Year Quatitative Comp Exam Sprig, 2012 Istructio: There are three parts. Aswer every questio i every part. Questio I-1 Part I - 203A A radom variable X is distributed with the margial desity: >

More information

Element sampling: Part 2

Element sampling: Part 2 Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

SOME THEORY AND PRACTICE OF STATISTICS by Howard G. Tucker

SOME THEORY AND PRACTICE OF STATISTICS by Howard G. Tucker SOME THEORY AND PRACTICE OF STATISTICS by Howard G. Tucker CHAPTER 9. POINT ESTIMATION 9. Covergece i Probability. The bases of poit estimatio have already bee laid out i previous chapters. I chapter 5

More information

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1 EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

TMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences.

TMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences. Norwegia Uiversity of Sciece ad Techology Departmet of Mathematical Scieces Corrected 3 May ad 4 Jue Solutios TMA445 Statistics Saturday 6 May 9: 3: Problem Sow desity a The probability is.9.5 6x x dx

More information

Please do NOT write in this box. Multiple Choice. Total

Please do NOT write in this box. Multiple Choice. Total Istructor: Math 0560, Worksheet Alteratig Series Jauary, 3000 For realistic exam practice solve these problems without lookig at your book ad without usig a calculator. Multiple choice questios should

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Direction: This test is worth 250 points. You are required to complete this test within 50 minutes.

Direction: This test is worth 250 points. You are required to complete this test within 50 minutes. Term Test October 3, 003 Name Math 56 Studet Number Directio: This test is worth 50 poits. You are required to complete this test withi 50 miutes. I order to receive full credit, aswer each problem completely

More information

Math 113 Exam 3 Practice

Math 113 Exam 3 Practice Math Exam Practice Exam 4 will cover.-., 0. ad 0.. Note that eve though. was tested i exam, questios from that sectios may also be o this exam. For practice problems o., refer to the last review. This

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Output Analysis and Run-Length Control

Output Analysis and Run-Length Control IEOR E4703: Mote Carlo Simulatio Columbia Uiversity c 2017 by Marti Haugh Output Aalysis ad Ru-Legth Cotrol I these otes we describe how the Cetral Limit Theorem ca be used to costruct approximate (1 α%

More information

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the

More information

Mathematical Statistics - MS

Mathematical Statistics - MS Paper Specific Istructios. The examiatio is of hours duratio. There are a total of 60 questios carryig 00 marks. The etire paper is divided ito three sectios, A, B ad C. All sectios are compulsory. Questios

More information

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2. SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample

More information

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Lecture 3. Properties of Summary Statistics: Sampling Distribution Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary

More information

NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS

NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS STRUCTURE OF EXAMINATION PAPER. There will be oe 2-hour paper cosistig of 4 questios.

More information

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam. Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

November 2002 Course 4 solutions

November 2002 Course 4 solutions November Course 4 solutios Questio # Aswer: B φ ρ = = 5. φ φ ρ = φ + =. φ Solvig simultaeously gives: φ = 8. φ = 6. Questio # Aswer: C g = [(.45)] = [5.4] = 5; h= 5.4 5 =.4. ˆ π =.6 x +.4 x =.6(36) +.4(4)

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

KLMED8004 Medical statistics. Part I, autumn Estimation. We have previously learned: Population and sample. New questions

KLMED8004 Medical statistics. Part I, autumn Estimation. We have previously learned: Population and sample. New questions We have previously leared: KLMED8004 Medical statistics Part I, autum 00 How kow probability distributios (e.g. biomial distributio, ormal distributio) with kow populatio parameters (mea, variace) ca give

More information

Lecture 11 and 12: Basic estimation theory

Lecture 11 and 12: Basic estimation theory Lecture ad 2: Basic estimatio theory Sprig 202 - EE 94 Networked estimatio ad cotrol Prof. Kha March 2 202 I. MAXIMUM-LIKELIHOOD ESTIMATORS The maximum likelihood priciple is deceptively simple. Louis

More information

Last Lecture. Wald Test

Last Lecture. Wald Test Last Lecture Biostatistics 602 - Statistical Iferece Lecture 22 Hyu Mi Kag April 9th, 2013 Is the exact distributio of LRT statistic typically easy to obtai? How about its asymptotic distributio? For testig

More information

Areas and Distances. We can easily find areas of certain geometric figures using well-known formulas:

Areas and Distances. We can easily find areas of certain geometric figures using well-known formulas: Areas ad Distaces We ca easily fid areas of certai geometric figures usig well-kow formulas: However, it is t easy to fid the area of a regio with curved sides: METHOD: To evaluate the area of the regio

More information

This is an introductory course in Analysis of Variance and Design of Experiments.

This is an introductory course in Analysis of Variance and Design of Experiments. 1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class

More information

Lecture 12: November 13, 2018

Lecture 12: November 13, 2018 Mathematical Toolkit Autum 2018 Lecturer: Madhur Tulsiai Lecture 12: November 13, 2018 1 Radomized polyomial idetity testig We will use our kowledge of coditioal probability to prove the followig lemma,

More information

SDS 321: Introduction to Probability and Statistics

SDS 321: Introduction to Probability and Statistics SDS 321: Itroductio to Probability ad Statistics Lecture 23: Cotiuous radom variables- Iequalities, CLT Puramrita Sarkar Departmet of Statistics ad Data Sciece The Uiversity of Texas at Austi www.cs.cmu.edu/

More information

Exponential Families and Bayesian Inference

Exponential Families and Bayesian Inference Computer Visio Expoetial Families ad Bayesia Iferece Lecture Expoetial Families A expoetial family of distributios is a d-parameter family f(x; havig the followig form: f(x; = h(xe g(t T (x B(, (. where

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

AAEC/ECON 5126 FINAL EXAM: SOLUTIONS

AAEC/ECON 5126 FINAL EXAM: SOLUTIONS AAEC/ECON 5126 FINAL EXAM: SOLUTIONS SPRING 2015 / INSTRUCTOR: KLAUS MOELTNER This exam is ope-book, ope-otes, but please work strictly o your ow. Please make sure your ame is o every sheet you re hadig

More information

Lecture 33: Bootstrap

Lecture 33: Bootstrap Lecture 33: ootstrap Motivatio To evaluate ad compare differet estimators, we eed cosistet estimators of variaces or asymptotic variaces of estimators. This is also importat for hypothesis testig ad cofidece

More information

32 estimating the cumulative distribution function

32 estimating the cumulative distribution function 32 estimatig the cumulative distributio fuctio 4.6 types of cofidece itervals/bads Let F be a class of distributio fuctios F ad let θ be some quatity of iterest, such as the mea of F or the whole fuctio

More information

This section is optional.

This section is optional. 4 Momet Geeratig Fuctios* This sectio is optioal. The momet geeratig fuctio g : R R of a radom variable X is defied as g(t) = E[e tx ]. Propositio 1. We have g () (0) = E[X ] for = 1, 2,... Proof. Therefore

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

Table 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab

Table 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab Sectio 12 Tests of idepedece ad homogeeity I this lecture we will cosider a situatio whe our observatios are classified by two differet features ad we would like to test if these features are idepedet

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information

Interval Estimation (Confidence Interval = C.I.): An interval estimate of some population parameter is an interval of the form (, ),

Interval Estimation (Confidence Interval = C.I.): An interval estimate of some population parameter is an interval of the form (, ), Cofidece Iterval Estimatio Problems Suppose we have a populatio with some ukow parameter(s). Example: Normal(,) ad are parameters. We eed to draw coclusios (make ifereces) about the ukow parameters. We

More information

PRACTICE PROBLEMS FOR THE FINAL

PRACTICE PROBLEMS FOR THE FINAL PRACTICE PROBLEMS FOR THE FINAL Math 36Q Fall 25 Professor Hoh Below is a list of practice questios for the Fial Exam. I would suggest also goig over the practice problems ad exams for Exam ad Exam 2 to

More information

6 Sample Size Calculations

6 Sample Size Calculations 6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

Lecture Note 8 Point Estimators and Point Estimation Methods. MIT Spring 2006 Herman Bennett

Lecture Note 8 Point Estimators and Point Estimation Methods. MIT Spring 2006 Herman Bennett Lecture Note 8 Poit Estimators ad Poit Estimatio Methods MIT 14.30 Sprig 2006 Herma Beett Give a parameter with ukow value, the goal of poit estimatio is to use a sample to compute a umber that represets

More information

STAT Homework 1 - Solutions

STAT Homework 1 - Solutions STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better

More information

Inferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process.

Inferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process. Iferetial Statistics ad Probability a Holistic Approach Iferece Process Chapter 8 Poit Estimatio ad Cofidece Itervals This Course Material by Maurice Geraghty is licesed uder a Creative Commos Attributio-ShareAlike

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

x = Pr ( X (n) βx ) =

x = Pr ( X (n) βx ) = Exercise 93 / page 45 The desity of a variable X i i 1 is fx α α a For α kow let say equal to α α > fx α α x α Pr X i x < x < Usig a Pivotal Quatity: x α 1 < x < α > x α 1 ad We solve i a similar way as

More information

AMS570 Lecture Notes #2

AMS570 Lecture Notes #2 AMS570 Lecture Notes # Review of Probability (cotiued) Probability distributios. () Biomial distributio Biomial Experimet: ) It cosists of trials ) Each trial results i of possible outcomes, S or F 3)

More information

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008 Chapter 6 Part 5 Cofidece Itervals t distributio chi square distributio October 23, 2008 The will be o help sessio o Moday, October 27. Goal: To clearly uderstad the lik betwee probability ad cofidece

More information

CSE 527, Additional notes on MLE & EM

CSE 527, Additional notes on MLE & EM CSE 57 Lecture Notes: MLE & EM CSE 57, Additioal otes o MLE & EM Based o earlier otes by C. Grat & M. Narasimha Itroductio Last lecture we bega a examiatio of model based clusterig. This lecture will be

More information

Confidence Level We want to estimate the true mean of a random variable X economically and with confidence.

Confidence Level We want to estimate the true mean of a random variable X economically and with confidence. Cofidece Iterval 700 Samples Sample Mea 03 Cofidece Level 095 Margi of Error 0037 We wat to estimate the true mea of a radom variable X ecoomically ad with cofidece True Mea μ from the Etire Populatio

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week Lecture: Cocept Check Exercises Starred problems are optioal. Statistical Learig Theory. Suppose A = Y = R ad X is some other set. Furthermore, assume P X Y is a discrete

More information

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?

More information

Lecture 12: September 27

Lecture 12: September 27 36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.

More information

Confidence Intervals for the Population Proportion p

Confidence Intervals for the Population Proportion p Cofidece Itervals for the Populatio Proportio p The cocept of cofidece itervals for the populatio proportio p is the same as the oe for, the samplig distributio of the mea, x. The structure is idetical:

More information

Math 10A final exam, December 16, 2016

Math 10A final exam, December 16, 2016 Please put away all books, calculators, cell phoes ad other devices. You may cosult a sigle two-sided sheet of otes. Please write carefully ad clearly, USING WORDS (ot just symbols). Remember that the

More information

MATH 472 / SPRING 2013 ASSIGNMENT 2: DUE FEBRUARY 4 FINALIZED

MATH 472 / SPRING 2013 ASSIGNMENT 2: DUE FEBRUARY 4 FINALIZED MATH 47 / SPRING 013 ASSIGNMENT : DUE FEBRUARY 4 FINALIZED Please iclude a cover sheet that provides a complete setece aswer to each the followig three questios: (a) I your opiio, what were the mai ideas

More information

Zeros of Polynomials

Zeros of Polynomials Math 160 www.timetodare.com 4.5 4.6 Zeros of Polyomials I these sectios we will study polyomials algebraically. Most of our work will be cocered with fidig the solutios of polyomial equatios of ay degree

More information

Stat410 Probability and Statistics II (F16)

Stat410 Probability and Statistics II (F16) Some Basic Cocepts of Statistical Iferece (Sec 5.) Suppose we have a rv X that has a pdf/pmf deoted by f(x; θ) or p(x; θ), where θ is called the parameter. I previous lectures, we focus o probability problems

More information

Department of Mathematics

Department of Mathematics Departmet of Mathematics Ma 3/103 KC Border Itroductio to Probability ad Statistics Witer 2017 Lecture 19: Estimatio II Relevat textbook passages: Larse Marx [1]: Sectios 5.2 5.7 19.1 The method of momets

More information

Clases 7-8: Métodos de reducción de varianza en Monte Carlo *

Clases 7-8: Métodos de reducción de varianza en Monte Carlo * Clases 7-8: Métodos de reducció de variaza e Mote Carlo * 9 de septiembre de 27 Ídice. Variace reductio 2. Atithetic variates 2 2.. Example: Uiform radom variables................ 3 2.2. Example: Tail

More information

Module 1 Fundamentals in statistics

Module 1 Fundamentals in statistics Normal Distributio Repeated observatios that differ because of experimetal error ofte vary about some cetral value i a roughly symmetrical distributio i which small deviatios occur much more frequetly

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

1.010 Uncertainty in Engineering Fall 2008

1.010 Uncertainty in Engineering Fall 2008 MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval

More information

There is no straightforward approach for choosing the warmup period l.

There is no straightforward approach for choosing the warmup period l. B. Maddah INDE 504 Discrete-Evet Simulatio Output Aalysis () Statistical Aalysis for Steady-State Parameters I a otermiatig simulatio, the iterest is i estimatig the log ru steady state measures of performace.

More information