EXAMINERS' REPORT & SOLUTIONS, STATISTICS 1 (MATH 11400), May-June 2009
Examiners' Report

A1. Most plots were well done. Some candidates muddled hinges and quartiles and gave the wrong one. Generally candidates correctly identified F⁻¹(u), but some failed to find F⁻¹(k/(n + 1)) or got the coordinates the wrong way round.

A2. Generally answered correctly.

A3. Parts (i) and (ii) were generally answered correctly. Some candidates incorrectly took P(X = 20) = 0.2 rather than 0.1, and some computed E(X²) rather than Var(X). Part (iii) was less well done and some candidates seemed totally confused. Many candidates correctly identified the distribution of U − T as Normal, but some took it to have variance σ²_U − σ²_T rather than σ²_U + σ²_T.

A4. One common error was to transpose S and σ, and incorrectly write √n(X̄ − µ)/S ∼ N(0, 1) or Σ(Xᵢ − X̄)²/S² ∼ χ²_{n−1} in place of the correct statements √n(X̄ − µ)/σ ∼ N(0, 1) and Σ(Xᵢ − X̄)²/σ² ∼ χ²_{n−1}. Otherwise both parts were generally answered correctly.

A5. In part (i) some candidates lost marks by failing to state the distribution of √n(X̄ − µ)/σ₀. Most candidates correctly computed the end points of the confidence interval, though one or two incorrectly used a t-distribution or even a chi-square distribution. In part (ii), most candidates answered (a) and (b) correctly, but not many got (c) completely correct.

B1. Most candidates found parts (a) and (b) straightforward, though some candidates incorrectly defined the bias as E(θ − θ̂) (it should be E(θ̂ − θ)) and some took mse(θ̂) = Var(θ̂) + bias(θ̂)² as the definition of the mse (it should be mse(θ̂) = E[(θ̂ − θ)²]). A small number of candidates forgot that θ here is the fixed constant value of the parameter, and tried to compute E(θ). Candidates seemed to find part (c) harder; some got as far as τ̂ = exp{log(2)/θ̂} − 1 without realising that exp{log(2)/θ̂} is just 2^(1/θ̂).
Most candidates answered part (d), but some did not make the step from the median of each distribution (shown in the boxplot) to the mean (required for the bias and variance).

B2. Part (a) was attempted by almost all candidates, and generally answered correctly. Most candidates attempted part (b), usually correctly identifying the appropriate test (the paired t-test) and the correct hypotheses, finding the correct form of test statistic, and using the correct approach to computing the p-value. However, candidates did not always identify the correct model assumptions (i.e. that d₁, ..., dₙ are the observed values of a random sample from the Normal distribution with unknown mean and variance); a surprising number made numerical slips in computing the variance estimate s² = (Σdᵢ² − n d̄²)/(n − 1); some failed to state the distribution of the test statistic under H₀; and some failed to correctly interpret their p-value or draw appropriate conclusions. Candidates generally found part (c) harder, and many failed to relate the occurrence of, say, a Type II error to a range of D̄ values for which the probability could then easily be computed under H₁.

B3. This question was significantly less popular than B1 or B2. In part (a), most students correctly computed the estimates, but very few were able to derive the equations satisfied by the least squares estimates. Candidates attempting part (b) generally identified the correct formulae for the end points of the confidence intervals, but many made numerical slips in the calculation, often computing an incorrect value for s_α̂ even having found the correct value for σ̂². Again, candidates found part (c) testing, and only a small number worked their way through to the end.
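The variance shortcut mentioned above, s² = (Σdᵢ² − n d̄²)/(n − 1), is algebraically identical to the definitional form Σ(dᵢ − d̄)²/(n − 1), which a quick numerical check confirms. The dᵢ values below are made up for illustration; they are not the exam data:

```python
# Check that the shortcut form of the sample variance,
# s^2 = (sum(d_i^2) - n*dbar^2)/(n - 1), agrees with the definitional form.
# The d_i below are made-up illustrative values, not the exam data.
d = [1.2, 2.5, 0.8, 3.1, 1.9, 2.2, 0.5, 2.8, 1.4, 2.6]
n = len(d)
dbar = sum(d) / n

s2_def = sum((di - dbar) ** 2 for di in d) / (n - 1)             # definition
s2_short = (sum(di * di for di in d) - n * dbar ** 2) / (n - 1)  # shortcut

print(s2_def, s2_short)
```

The shortcut form is convenient with tabulated sums, but it is also where the numerical slips noted above tend to occur, since it subtracts two quantities of similar size.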
Solutions

A1. (i) The ordered observations are: 0.50, 0.66, 0.71, 0.92, 0.94, 1.02, 1.06, 1.12, 1.20, 1.23, 1.47. The values used to construct the boxplot are: the smallest observation x(1) = 0.50; the lower hinge = median of {data values ≤ median} = (0.71 + 0.92)/2 = 0.815; the median x(6) = 1.02; the upper hinge = median of {data values ≥ median} = (1.12 + 1.20)/2 = 1.16; and the largest observation x(11) = 1.47.

(ii) F(x) = x/2 has inverse F⁻¹(u) = 2u, so the fitted quantiles are F⁻¹(k/(n + 1)) = 2k/(n + 1) for k = 1, ..., n. The smallest observation is x(1) = 0.50 and the corresponding fitted quantile (as n = 11) is F⁻¹(1/12) = 0.167, so the coordinates of the point are (0.167, 0.50). The upper tail values are smaller than expected (shorter upper tails) and the lower tail values are larger than expected (shorter lower tails), so overall the data are more concentrated about their centre than expected under the fitted distribution.

A2. (i) For two unknown parameters we use the equations involving the two lowest moments of the distribution. Here E(X²; µ, σ²) = Var(X; µ, σ²) + [E(X; µ, σ²)]² = σ² + µ². Let m₁ = (x₁ + ... + xₙ)/n and m₂ = (x₁² + ... + xₙ²)/n. Then the method of moments estimators satisfy E(X; µ̂, σ̂²) = m₁ and E(X²; µ̂, σ̂²) = m₂, i.e. µ̂ = m₁ and σ̂² + µ̂² = m₂.

(ii) Solving the first equation in (i) gives µ̂ = m₁ = x̄. Substituting this into the second equation gives σ̂² + m₁² = m₂, so σ̂² = m₂ − m₁² = Σxᵢ²/n − x̄² = Σ(xᵢ − x̄)²/n.

A3. (i) Let X denote the value of a randomly chosen card. P(X = 5) = 0.6, P(X = 10) = 0.3 and P(X = 20) = 0.1, so µ = 5 × 0.6 + 10 × 0.3 + 20 × 0.1 = 8, E(X²) = 25 × 0.6 + 100 × 0.3 + 400 × 0.1 = 85, and σ² = E(X²) − µ² = 85 − 64 = 21.

(ii) The customer collects n = 50 cards, so from the central limit theorem T = ΣXᵢ ≈ N(nµ, nσ²) = N(400, 1050).

(iii) Exactly the same argument gives U ≈ N(400, 1050). Since T and U are independent, U − T ∼ N(µ_U − µ_T, σ²_U + σ²_T) = N(0, 2100) and (U − T)/√2100 ∼ N(0, 1). Thus P(U − T > 20) = P((U − T)/√2100 > 20/√2100) = P(Z > 0.44) = 1 − P(Z < 0.44) = 1 − 0.6700 = 0.3300.

A4. (i) Let U ∼ N(0, 1) and let V ∼ χ²_r, with U and V independent.
Then from notes W = U/√(V/r) has the t distribution with r degrees of freedom, and we write W ∼ t_r.

(ii) From notes, X̄ ∼ N(µ, σ²/n), so that √n(X̄ − µ)/σ ∼ N(0, 1). Also from notes, (n − 1)S²/σ² = Σ(Xᵢ − X̄)²/σ² ∼ χ²_{n−1}. But √n(X̄ − µ)/S = [√n(X̄ − µ)/σ]/√[(n − 1)S²/((n − 1)σ²)], so √n(X̄ − µ)/S = U/√(V/(n − 1)), where U ∼ N(0, 1) and V ∼ χ²_{n−1}. Hence, from the result in (i), √n(X̄ − µ)/S ∼ t_{n−1}.
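The construction W = U/√(V/r) can be sanity-checked by Monte Carlo simulation: such samples should have the variance r/(r − 2) of a t_r variable. This is a minimal sketch; the choice r = 6 and the number of replicates are arbitrary illustrative values:

```python
import math
import random

random.seed(0)
r = 6            # degrees of freedom (an arbitrary illustrative choice)
N = 100_000      # number of simulated values

samples = []
for _ in range(N):
    u = random.gauss(0.0, 1.0)                              # U ~ N(0, 1)
    v = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(r))  # V ~ chi^2_r
    samples.append(u / math.sqrt(v / r))                    # W = U / sqrt(V/r)

# A t_r variable has mean 0 and variance r/(r - 2) for r > 2.
var = sum(w * w for w in samples) / N
print(var)  # should be close to r/(r - 2) = 1.5
```

Here the χ²_r draw is built directly from its definition as a sum of r squared independent standard normals, mirroring the definitions of U and V in part (i).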
A5. (i) When σ is known to take the value σ₀, from notes √n(X̄ − µ)/σ₀ ∼ N(0, 1). Thus the end points of a 100(1 − α)% confidence interval are c_L = x̄ − z_{α/2} σ₀/√n and c_U = x̄ + z_{α/2} σ₀/√n, where z_α is such that P(Z > z_α) = α when Z ∼ N(0, 1). Here n = 10, α/2 = 0.05, z₀.₀₅ = 1.645, x̄ = 5 and σ₀ = 3, so c_L = 5 − 1.645 × 3/√10 = 5 − 1.56 = 3.44 and c_U = 5 + 1.645 × 3/√10 = 5 + 1.56 = 6.56.

(ii) (a) The interval would become shorter, since 1/√n decreases as n increases; intuitively, more observations give more information, so a smaller interval is needed for the same confidence level. (b) It would become larger, since z_{α/2} increases as α decreases; intuitively, more confidence requires a larger interval. (c) It might increase, since t_{α/2;n−1} > z_{α/2}; intuitively, more uncertainty over σ means less information, so a larger interval is needed. However, in this case the end points are also affected by the value of the estimate of σ: if this were smaller than σ₀, it would tend to decrease the interval length.
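The interval arithmetic in A5(i) is easy to reproduce; this sketch simply re-does the calculation with the values quoted in the solution (n = 10, x̄ = 5, σ₀ = 3, z₀.₀₅ = 1.645):

```python
import math

# Values from A5(i): n = 10, xbar = 5, sigma_0 = 3, alpha/2 = 0.05.
n, xbar, sigma0, z = 10, 5.0, 3.0, 1.645

half_width = z * sigma0 / math.sqrt(n)   # z_{alpha/2} * sigma_0 / sqrt(n)
c_L = xbar - half_width
c_U = xbar + half_width
print(round(c_L, 2), round(c_U, 2))  # 3.44 6.56
```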
B1. (a) For observations from the given Pareto distribution, f(x; θ) = θ/(1 + x)^(θ+1), so log f(x; θ) = log(θ) − (θ + 1) log(1 + x) and (∂/∂θ) log f(x; θ) = 1/θ − log(1 + x). Hence (∂/∂θ) Σ log f(xᵢ; θ) = [1/θ − log(1 + x₁)] + ... + [1/θ − log(1 + xₙ)] = n/θ − Σ log(1 + xᵢ), so the likelihood equation satisfied by the maximum likelihood estimate θ̂ is n/θ̂ − Σ log(1 + xᵢ) = 0, giving θ̂ = n/Σ log(1 + xᵢ).

(b) bias(θ̂) = E(θ̂ − θ), mse(θ̂) = E[(θ̂ − θ)²], and (from notes) mse(θ̂) = Var(θ̂) + bias(θ̂)². Here E(θ̂) = nθ/(n − 1), so bias(θ̂) = E(θ̂) − θ = nθ/(n − 1) − θ = θ/(n − 1).

(c) You are given that (1 + τ)^θ = 2, so τ = τ(θ) = 2^(1/θ) − 1. The maximum likelihood estimate has the invariance property τ̂ = τ(θ̂). Thus τ̂ = 2^(1/θ̂) − 1 = 2^(Σ log(1+xᵢ)/n) − 1.

(d) Despite the outliers in the boxplot for the sample median, both boxplots appear roughly symmetric about the median, possibly with some positive skew. This would imply that, in the distribution of each estimator, the median is approximately equal to, or slightly less than, the mean. The plot indicates that for each distribution the median is close to 1, so for each distribution the mean is also approximately 1, and hence both estimators are approximately unbiased for τ = 1 (though the plot seems to indicate that the median/mean of the distribution of τ̂_mle is slightly less than 1, indicating a slight negative bias). However, the plot indicates that estimates based on the sample median have substantially greater variability about their median than the maximum likelihood estimates, with the hinges and whisker ends in the sample median boxplot noticeably further away from the median than in the mle plot. Since the median and the mean are not that dissimilar here, this indicates that the sample median has greater variability about its mean. Since both estimators are approximately unbiased, this indicates that the sample median has greater variability about τ, and hence greater mean square error than the mle, as an estimator of τ.
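A short numerical sketch of the B1 estimates, using a made-up sample (the exam data are not reproduced in this report): it computes θ̂ = n/Σ log(1 + xᵢ) and, by invariance, τ̂ = 2^(1/θ̂) − 1:

```python
import math

# Made-up sample for illustration; not the exam data.
x = [0.3, 1.1, 0.7, 2.4, 0.2, 1.8, 0.5, 0.9]
n = len(x)

s = sum(math.log(1.0 + xi) for xi in x)
theta_hat = n / s                          # MLE: theta_hat = n / sum(log(1 + x_i))
tau_hat = 2.0 ** (1.0 / theta_hat) - 1.0   # invariance: tau_hat = 2^(1/theta_hat) - 1

def loglik(theta):
    # log-likelihood: n*log(theta) - (theta + 1) * sum(log(1 + x_i))
    return n * math.log(theta) - (theta + 1.0) * s

print(theta_hat, tau_hat)
```

A quick check that loglik(theta_hat) beats the log-likelihood at a few other θ values confirms that the stationary point really is the maximiser.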
B2. (a) (i) A Type I error occurs if we reject H₀ when it is in fact true; (ii) a Type II error occurs if we accept H₀ when it is in fact false; (iii) the significance level of the test is P(Type I error) = P(Reject H₀ | H₀ true); (iv) the power is 1 − P(Type II error) = P(Reject H₀ | H₁ true) = 1 − P(Accept H₀ | H₁ true).

(b) A student's body fat percentage both before and after the course is likely to depend on the student as well as on the effect of the course: some students may have naturally high levels of body fat, others naturally lower levels. However, the difference in the percentage of body fat for each student may well be independent of the differences for the other subjects. Thus it is sensible to use a paired t-test based on the differences dᵢ = xᵢ − yᵢ.

Model assumptions: Let Xᵢ denote the percentage of body fat before the course for the ith student, let Yᵢ denote the percentage of body fat after the course for the same student, and let Dᵢ = Xᵢ − Yᵢ. Assume D₁, ..., D₁₀ are a simple random sample from the N(µ, σ²) distribution, where µ and σ² are unknown.

Hypotheses: H₀: µ = 0 versus H_A: µ > 0. H₀ corresponds to a zero mean difference between the before and after body fat percentages; H_A corresponds to the percentages after the course being systematically smaller than before the course, so that the mean difference is positive.

Test statistic: Let T = √n D̄/σ̂_D, where here n = 10 and σ̂²_D = S²_D = Σ(Dᵢ − D̄)²/(10 − 1), so T has the t₉ distribution when H₀ is true. For the given data, the dᵢ values have mean d̄ = 1.7 and variance s²_D, giving the observed test statistic t_obs = √10 d̄/s_D.

p-value: For H_A: µ > 0, the values of T which are at least as extreme as t_obs are the set {T ≥ t_obs}. Thus the p-value = P(T ≥ t_obs | H₀ true) = P(t₉ ≥ t_obs). The values of P(t₉ ≥ 2.7) and P(t₉ ≥ 2.8) can be read from tables, and linear interpolation between them gives the p-value.

[Not required, but the following may replace the computation of the p-value.]
Critical region: The critical region of values for which an α-level test would reject H₀ has the form C = {T ≥ c₁}, where c₁ is defined by α = P(Reject H₀ | H₀ true) = P(T ≥ c₁ | H₀ true) = P(t₉ ≥ c₁). Thus, for α = 0.05, c₁ = t₉;₀.₀₅ = 1.833, giving C = {T ≥ 1.833}.

Conclusions: The p-value is very small, so there is very strong evidence that H₀ is not true. [Or: the observed test statistic value t_obs falls well within the critical region of the 0.05-level test.] Thus we would reject H₀ in favour of H_A, and conclude that the mean body fat percentages after taking the course are indeed smaller than those before taking the course.

(c) Under H₁, the power of the test is 1 − P(Type II error) = 1 − P(Accept H₀ | H₁ true) = P(Reject H₀ | H₁ true) = P(T ≥ 1.645 | H₁ true) = 1 − P(T < 1.645 | H₁ true). But when H₁: µ = 1 is true, Dᵢ ∼ N(1, 4), so D̄ ∼ N(1, 4/10), so √10(D̄ − 1)/2 ∼ N(0, 1), and T < 1.645 ⟺ √10 D̄/2 < 1.645 ⟺ D̄ < 1.0404 ⟺ √10(D̄ − 1)/2 < 0.0638. So Power = 1 − P(√10(D̄ − 1)/2 < 0.0638 | H₁ true) = 1 − P(Z < 0.0638) [where Z ∼ N(0, 1)] = 1 − Φ(0.0638) = 1 − 0.5254 = 0.4746.
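The power calculation in B2(c) can be replayed numerically with the standard normal CDF; the numbers (σ = 2 and µ = 1 under H₁, one-sided critical value 1.645) are those used in the solution:

```python
import math

def Phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, sigma, mu1 = 10, 2.0, 1.0   # under H1: D_i ~ N(1, 4), so sigma = 2
crit = 1.645                   # one-sided 0.05-level critical value

# Reject H0 when sqrt(n)*Dbar/sigma >= crit, i.e. Dbar >= threshold.
threshold = crit * sigma / math.sqrt(n)
z = math.sqrt(n) * (threshold - mu1) / sigma   # standardise Dbar under H1
power = 1.0 - Phi(z)
print(round(power, 3))  # about 0.475
```

The key step, as in the solution, is rewriting the rejection event {T ≥ 1.645} as an event about D̄ and then standardising D̄ under H₁ rather than under H₀.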
B3. (a) The least squares estimates minimise the sum of squares S(α, β) = Σ(yᵢ − α − βxᵢ)², so they satisfy the equations obtained by setting ∂S/∂α = 0 and ∂S/∂β = 0. The first equation gives −2Σ(yᵢ − α̂ − β̂xᵢ) = 0, i.e. ȳ − α̂ − β̂x̄ = 0, so α̂ = ȳ − β̂x̄. The second equation gives −2Σ(yᵢxᵢ − α̂xᵢ − β̂xᵢ²) = 0, i.e. Σyᵢxᵢ − α̂nx̄ − β̂Σxᵢ² = 0. Substituting in for α̂ gives Σyᵢxᵢ − (ȳ − β̂x̄)nx̄ − β̂Σxᵢ² = 0, and rearranging gives β̂(Σxᵢ² − nx̄²) = Σyᵢxᵢ − nȳx̄, so β̂ = (Σyᵢxᵢ − nȳx̄)/(Σxᵢ² − nx̄²), or equivalently β̂ = (Σyᵢxᵢ − ΣyᵢΣxᵢ/n)/(Σxᵢ² − (Σxᵢ)²/n).

Here Σxᵢ = 30, Σyᵢ = 73.62, Σxᵢ² = 110 and Σyᵢxᵢ = 254.30, giving β̂ = (254.30 − 30 × 73.62/10)/(110 − 30²/10) = 33.44/20 = 1.672 and α̂ = Σyᵢ/n − β̂ Σxᵢ/n = 7.362 − 1.672 × 3 = 2.346.

The definitions of xᵢ and yᵢ were inadvertently swapped at some point when the question was being revised, so times became distances and vice versa. The intended answer was that the estimated time y to reach an address a distance x from the base is y = α̂ + β̂x, so the estimated time taken to reach an address 2.5 miles from the base is α̂ + 2.5β̂ = 6.526. However, as printed, x denoted the time and y = 2.5 the distance, so the estimated time became x = (y − α̂)/β̂ = 0.092. Both answers were awarded full marks.

(b) From the R output we can read off the estimate of σ and the corresponding value of s_α̂ = 0.367. Alternatively, σ̂² = [Σyᵢ² − nȳ² − (Σxᵢyᵢ − nx̄ȳ)²/(Σxᵢ² − nx̄²)]/(n − 2), giving s²_α̂ = σ̂²(1/n + x̄²/Σ(xᵢ − x̄)²) and s_α̂ = 0.367. You are given that (α̂ − α)/s_α̂ has the t-distribution with n − 2 degrees of freedom, so the end points (c_L, c_U) of a 100(1 − γ)% confidence interval for α are given by c_L = α̂ − t_{n−2;γ/2} s_α̂ and c_U = α̂ + t_{n−2;γ/2} s_α̂. For a 95% confidence interval, γ/2 = 0.025 and from tables t₈;₀.₀₂₅ = 2.306. Thus the 95% confidence interval for α has end points c_L = 2.346 − 2.306 × 0.367 = 1.4996 and c_U = 2.346 + 2.306 × 0.367 = 3.1924.

(c) For fixed x, you are given that α̂ + β̂x can be expressed in the form α̂ + β̂x = Σᵢ₌₁ⁿ Yᵢ(1/n + bᵢ(x − x̄)), where bᵢ = (xᵢ − x̄)/s_xx and s_xx = Σ(xᵢ − x̄)². Now
Σbᵢ = Σ(xᵢ − x̄)/s_xx = 0/s_xx = 0 and Σbᵢ² = Σ(xᵢ − x̄)²/s_xx² = s_xx/s_xx² = 1/s_xx. Since the Yᵢ are independent and each has the same variance σ², we have Var(α̂ + β̂x) = Var(ΣYᵢ(1/n + bᵢ(x − x̄))) = ΣVar(Yᵢ)(1/n + bᵢ(x − x̄))² = σ² Σ(1/n + bᵢ(x − x̄))² = σ² Σ(1/n² + 2bᵢ(x − x̄)/n + bᵢ²(x − x̄)²) = σ²[n/n² + 2(x − x̄)(Σbᵢ)/n + (x − x̄)² Σbᵢ²] = σ²[1/n + (x − x̄)²/s_xx], since Σbᵢ = 0 and Σbᵢ² = 1/s_xx.
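The B3(a) estimates can be recomputed from the summary sums. One caveat: the value Σxᵢyᵢ = 254.30 below is back-calculated from the quoted estimate β̂ = 1.672 rather than read directly from the paper, so treat it as an assumption:

```python
# Summary sums from B3(a). Note: sum_xy = 254.30 is back-calculated from the
# quoted estimate beta_hat = 1.672, not read from the paper, so it is an
# assumption.
n = 10
sum_x, sum_y, sum_x2, sum_xy = 30.0, 73.62, 110.0, 254.30

xbar, ybar = sum_x / n, sum_y / n
beta_hat = (sum_xy - n * xbar * ybar) / (sum_x2 - n * xbar ** 2)
alpha_hat = ybar - beta_hat * xbar
print(round(beta_hat, 3), round(alpha_hat, 3))  # 1.672 2.346
```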
7.1 Basic Properties of Confidence Intervals What s Missing in a Point Just a single estimate What we need: how reliable it is Estimate? No idea how reliable this estimate is some measure of the variability
More informationLecture 10: Generalized likelihood ratio test
Stat 200: Introduction to Statistical Inference Autumn 2018/19 Lecture 10: Generalized likelihood ratio test Lecturer: Art B. Owen October 25 Disclaimer: These notes have not been subjected to the usual
More informationSTAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)
STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points
More informationWeek 9 The Central Limit Theorem and Estimation Concepts
Week 9 and Estimation Concepts Week 9 and Estimation Concepts Week 9 Objectives 1 The Law of Large Numbers and the concept of consistency of averages are introduced. The condition of existence of the population
More informationLectures on Simple Linear Regression Stat 431, Summer 2012
Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population
More informationMark Scheme 4766 June 2005 Statistics (4766) Qn Answer Mk Comment (i) Mean = 657/20 = 32.85 cao 2 (i) 657 2 Variance = (22839 - ) = 66.3 9 20 Standard deviation = 8.3 32.85 + 2(8.3) = 49. none of the 3
More informationRecall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n
Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible
More information1 Hypothesis testing for a single mean
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationApplied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013
Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis
More informationA Simulation Comparison Study for Estimating the Process Capability Index C pm with Asymmetric Tolerances
Available online at ijims.ms.tku.edu.tw/list.asp International Journal of Information and Management Sciences 20 (2009), 243-253 A Simulation Comparison Study for Estimating the Process Capability Index
More informationOne-Sample Numerical Data
One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html
More informationMeasures of center. The mean The mean of a distribution is the arithmetic average of the observations:
Measures of center The mean The mean of a distribution is the arithmetic average of the observations: x = x 1 + + x n n n = 1 x i n i=1 The median The median is the midpoint of a distribution: the number
More informationStatistics Ph.D. Qualifying Exam: Part II November 3, 2001
Statistics Ph.D. Qualifying Exam: Part II November 3, 2001 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. 1 2 3 4 5 6 7 8 9 10 11 12 2. Write your
More informationUQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables
UQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables To be provided to students with STAT2201 or CIVIL-2530 (Probability and Statistics) Exam Main exam date: Tuesday, 20 June 1
More informationMATH c UNIVERSITY OF LEEDS Examination for the Module MATH1725 (May-June 2009) INTRODUCTION TO STATISTICS. Time allowed: 2 hours
01 This question paper consists of 11 printed pages, each of which is identified by the reference. Only approved basic scientific calculators may be used. Statistical tables are provided at the end of
More informationA booklet Mathematical Formulae and Statistical Tables might be needed for some questions.
Paper Reference(s) 6663/0 Edexcel GCE Core Mathematics C Advanced Subsidiary Inequalities Calculators may NOT be used for these questions. Information for Candidates A booklet Mathematical Formulae and
More information0.1 The plug-in principle for finding estimators
The Bootstrap.1 The plug-in principle for finding estimators Under a parametric model P = {P θ ;θ Θ} (or a non-parametric P = {P F ;F F}), any real-valued characteristic τ of a particular member P θ (or
More informationNormal (Gaussian) distribution The normal distribution is often relevant because of the Central Limit Theorem (CLT):
Lecture Three Normal theory null distributions Normal (Gaussian) distribution The normal distribution is often relevant because of the Central Limit Theorem (CLT): A random variable which is a sum of many
More informationTesting Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata
Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function
More informationINTRODUCING LINEAR REGRESSION MODELS Response or Dependent variable y
INTRODUCING LINEAR REGRESSION MODELS Response or Dependent variable y Predictor or Independent variable x Model with error: for i = 1,..., n, y i = α + βx i + ε i ε i : independent errors (sampling, measurement,
More informationTwo hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45
Two hours MATH20802 To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER STATISTICAL METHODS 21 June 2010 9:45 11:45 Answer any FOUR of the questions. University-approved
More informationM(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1
Math 66/566 - Midterm Solutions NOTE: These solutions are for both the 66 and 566 exam. The problems are the same until questions and 5. 1. The moment generating function of a random variable X is M(t)
More informationInterval estimation. October 3, Basic ideas CLT and CI CI for a population mean CI for a population proportion CI for a Normal mean
Interval estimation October 3, 2018 STAT 151 Class 7 Slide 1 Pandemic data Treatment outcome, X, from n = 100 patients in a pandemic: 1 = recovered and 0 = not recovered 1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0
More informationRegression #3: Properties of OLS Estimator
Regression #3: Properties of OLS Estimator Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #3 1 / 20 Introduction In this lecture, we establish some desirable properties associated with
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationIntroduction to Statistics
MTH4106 Introduction to Statistics Notes 15 Spring 2013 Testing hypotheses about the mean Earlier, we saw how to test hypotheses about a proportion, using properties of the Binomial distribution It is
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationTopic 12 Overview of Estimation
Topic 12 Overview of Estimation Classical Statistics 1 / 9 Outline Introduction Parameter Estimation Classical Statistics Densities and Likelihoods 2 / 9 Introduction In the simplest possible terms, the
More information