Fin285a: Computer Simulations and Risk Assessment, Section 2.3.2: Hypothesis Testing and Confidence Intervals

2 Overview: Hypothesis testing terms; Testing a die; Testing issues; Estimating means; Confidence intervals. Fall 2017: LeBaron Fin285a: / 51

3 Preview: Correct models/parameters? Data similar? Has something changed in the data? Use one series to predict another. Confidence intervals: How good are parameter estimates? Closely related to hypothesis testing.

4 Simulation methodology: Get statistics from real data. Simulate the model. Compare model simulations to real data. Could the statistics (random variables) be draws from the model?

5 Hypothesis testing terms

6 Null hypothesis: An assumption about how the world works. Assume this is true. Could the data have come from this machine/theory/conjecture? Do you need more/other data?

8 More terms: Test statistic: the observed statistic (a random variable). p-value: the probability, under the null hypothesis, of observing a test statistic at least as extreme as the one observed. Example: trading profits = 100 (test statistic); given a random walk for prices (null hypothesis), Probability(profits >= 100) = 0.25 (p-value).

9 Three interrelated concepts: p-values; hypothesis tests/critical values; confidence intervals. Start with a histogram from a null hypothesis, and a test statistic (next slide).

11 Test statistic and null distribution. [Figure: histogram of X under the null hypothesis (frequency vs. X), with the test statistic marked. Unusual?] Pr(X > 1.3) = 0.10

14 Probability questions you can ask: $\Pr(X > t)$: probability that the null hypothesis gives a value larger than the test statistic. $\Pr(X < t)$: probability that the null hypothesis gives a value smaller than the test statistic. $\Pr(|X - k| > |t - k|)$: probability that the null hypothesis gives a value farther from $k$ than the test statistic. These are all a form of p-value.
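
The three probabilities above can be estimated directly from simulated draws of the null distribution. The following is a minimal sketch, assuming a standard normal null centered at k = 0 and a test statistic t = 1.3; the function name `p_values` and the seed are illustrative and not taken from the course code.

```python
import random

def p_values(null_draws, t, k=0.0):
    """Empirical p-values of test statistic t against draws from the null."""
    n = len(null_draws)
    upper = sum(x > t for x in null_draws) / n                         # Pr(X > t)
    lower = sum(x < t for x in null_draws) / n                         # Pr(X < t)
    two_sided = sum(abs(x - k) > abs(t - k) for x in null_draws) / n   # Pr(|X-k| > |t-k|)
    return upper, lower, two_sided

rng = random.Random(0)
# Assumed null for illustration: X is standard normal, so the center is k = 0.
null_draws = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
upper, lower, two_sided = p_values(null_draws, t=1.3, k=0.0)
print(upper, lower, two_sided)
```

Under this assumed null the upper-tail value should land near 0.10, and the two-sided value near twice that.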

16 Hypothesis test: Test whether the test statistic could have come from the null data generator. Answer: reject or cannot reject the null hypothesis. The test usually involves some critical value, $C$. Reject the null hypothesis when $t > C$ (one-tailed test), or $t < C$ (one-tailed test), or $|t - k| > C$ (two-tailed test).

17 Critical value (one tail). [Figure: histogram of X under the null hypothesis (frequency vs. X).] Example: reject the null if t > C, with C = 1.3. The probability of rejecting when the null is true is Pr(X > C) = 0.1.

18 Critical value (two tail). [Figure: histogram of X under the null hypothesis (frequency vs. X).] Reject the null if $|t| > C$. The probability of rejecting when the null is true is $\Pr(|X| > C) = 0.05$ (the area in each tail is 0.025).
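
Critical values like these can be read off the simulated null distribution as quantiles. The sketch below assumes a standard normal null and a hypothetical test statistic t = 1.5; the helper `quantile` is a simple illustrative implementation, not a library routine.

```python
import random

def quantile(sorted_xs, p):
    """Simple empirical quantile: the value at position p in a sorted list."""
    idx = min(len(sorted_xs) - 1, max(0, int(p * len(sorted_xs))))
    return sorted_xs[idx]

rng = random.Random(1)
# Assumed null for illustration: standard normal draws.
null_draws = sorted(rng.gauss(0.0, 1.0) for _ in range(50_000))

# One tail: choose C so that Pr(X > C) = 0.10.
c_one_tail = quantile(null_draws, 0.90)

# Two tail: choose C so that Pr(|X| > C) = 0.05 (0.025 in each tail).
abs_draws = sorted(abs(x) for x in null_draws)
c_two_tail = quantile(abs_draws, 0.95)

t = 1.5  # hypothetical test statistic
reject_one_tail = t > c_one_tail
reject_two_tail = abs(t) > c_two_tail
print(c_one_tail, c_two_tail, reject_one_tail, reject_two_tail)
```

The same statistic can be rejected by the 10% one-tailed test and not by the 5% two-tailed test, which is why the choice of test has to come from the question being asked.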

19 One versus two tailed tests: Depends on the spirit of the question and the alternative models you are thinking about. Think about a sample mean as an example. You have two samples, and the estimated mean has changed from $t$ to $s$. If you are asking whether it could have increased by as much as you saw in the data, then a one tail test, $t - s > C$, is probably in order. If you are asking whether it could have decreased by as much as you saw in the data, then a one tail test, $t - s < C$, is probably in order. If you are asking whether it could have changed by as much as you saw in the data, then a two tail test, $|t - s| > C$, is probably in order.

20 Testing a die

21 Testing a die: You've observed the following rolls of a die out of 6000 rolls. [Table of observed counts not preserved.] Could this have come from a fair die with probability 1/6 on each side?

22 Dietest.m: 1. Think up a test statistic. 2. Roll 6000 dice with sample. 3. Check how the value of the test statistic from the original data compares with the distribution from the simulations. 4. Python: dietest.py
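
The steps above can be sketched as follows. This is not the course's dietest.py: the observed counts are hypothetical (the original table is not available), and the test statistic, the sum of squared deviations of the face counts from 1000, is only one reasonable choice.

```python
import random
from collections import Counter

def deviation_stat(counts, n_rolls):
    """Sum of squared deviations of face counts from the fair-die expectation."""
    expected = n_rolls / 6
    return sum((counts.get(face, 0) - expected) ** 2 for face in range(1, 7))

rng = random.Random(2)
n_rolls = 6000

# Hypothetical observed counts for faces 1..6 (they sum to 6000).
observed = {1: 1050, 2: 980, 3: 1010, 4: 960, 5: 1020, 6: 980}
t_obs = deviation_stat(observed, n_rolls)

# Roll 6000 fair dice many times and record the statistic each time.
n_sims = 500
sim_stats = []
for _ in range(n_sims):
    rolls = rng.choices(range(1, 7), k=n_rolls)
    sim_stats.append(deviation_stat(Counter(rolls), n_rolls))

# p-value: how often a fair die looks at least as uneven as the data.
p_value = sum(s >= t_obs for s in sim_stats) / n_sims
print(t_obs, p_value)
```

With these made-up counts the p-value is large, so a fair die would not be rejected.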

23 A Bayesian moment: What if you want to assess the probability of different types of biased dice given some data, $\Pr(\text{die} \mid \text{data})$? For this you will need other tools, most likely Bayesian statistical methods. Classical stats often involves precise testing of a somewhat narrow null hypothesis.

24 Testing issues

25 Size and power: Prob(reject | null is true) = size of test = Type I error. Prob(reject | null is false) = power of test. Prob(not reject | null is false) = Type II error = (1 - power).
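
Size and power can both be estimated by simulation: run the test many times on data generated under the null, then on data generated under an alternative. The sketch below assumes a one-tailed test of a zero mean with known unit variance, 25 observations per sample, and an alternative mean of 0.5; all of these settings are illustrative.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, n_obs, crit, n_sims, rng):
    """Fraction of simulated samples whose z-statistic exceeds crit."""
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n_obs)]
        z = fmean(sample) * n_obs ** 0.5   # null mean 0, known sigma 1
        rejections += z > crit
    return rejections / n_sims

rng = random.Random(3)
crit = NormalDist().inv_cdf(0.95)                   # one-tailed 5% critical value
size = rejection_rate(0.0, 25, crit, 4000, rng)     # Prob(reject | null is true)
power = rejection_rate(0.5, 25, crit, 4000, rng)    # Prob(reject | null is false)
type_2 = 1.0 - power                                # Prob(not reject | null is false)
print(size, power, type_2)
```

Raising the critical value lowers the size but also lowers the power, which is the trade-off the mushroom example illustrates.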

29 Mushrooms and toadstools: Test for toadstool (poisonous). Null = mushroom. Reject (don't eat) if the test statistic rejects. Goal: eat mushrooms, throw out toadstools. Type I error: probability of throwing out good mushrooms, Prob(reject | null is true). Power: probability of throwing out toadstools, Prob(reject | null is false). Type II error: probability of accepting (eating) a toadstool, Prob(not reject | null is false).

30 Estimating means

31 Estimate mean for long run stock returns: Long range U.S. data ([years not preserved]). Annual returns (with dividends). Real returns (inflation adjusted).

32 Goals of this example: Move from p-values and critical values to confidence intervals. Compare analytic, monte-carlo, and bootstrap approaches.

33 Sample statistics:
$\hat\theta = \frac{1}{T}\sum_{t=1}^{T} R_t$
$\hat\sigma^2 = \frac{1}{T-1}\sum_{t=1}^{T} (R_t - \hat\theta)^2$
$\theta = E(R_t)$, $\sigma^2 = E(R_t - \theta)^2$

34 Basic analytic tests (t-test):
$Z = \frac{\hat\theta - \theta}{\hat\sigma/\sqrt{T}}$
Assume that $\hat\theta$ is normal. $Z$ is distributed with a student-t distribution with $T-1$ degrees of freedom ($t_{T-1}$). Null, $\theta = 0.06$: $Z = 1.82$, $\Pr(t_{T-1} > Z) = 0.035$; this is the p-value. Under this null ($\theta = 0.06$) a statistic this large is unlikely, but not impossible. What about $\theta = 0.05$? $Z = 2.60$, $\Pr(t_{T-1} > Z) =$ [value not preserved].
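
As a rough illustration of the statistic above, the sketch below computes Z for a hypothetical return series. The data are simulated, not the course's stock returns, and the p-value uses a normal approximation because the standard library has no student-t distribution; the slides use $t_{T-1}$, for which `scipy.stats.t.sf(z, T - 1)` would be the usual tool.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

def t_statistic(returns, theta_null):
    """Z = (theta_hat - theta_null) / (sigma_hat / sqrt(T))."""
    T = len(returns)
    theta_hat = fmean(returns)
    sigma_hat = stdev(returns)   # square root of the 1/(T-1) variance estimate
    return (theta_hat - theta_null) / (sigma_hat / math.sqrt(T))

rng = random.Random(4)
# Hypothetical annual returns; mean, std., and length are assumptions.
returns = [rng.gauss(0.08, 0.17) for _ in range(120)]

z = t_statistic(returns, theta_null=0.06)
# Normal approximation to the upper-tail p-value Pr(t_{T-1} > Z).
p_approx = 1.0 - NormalDist().cdf(z)
print(z, p_approx)
```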

35 Monte-carlo test: Assume $R_t$ is normal. Assume the null of $\theta = 0.06$ (population). Set $\sigma^2$ to the sample estimate, $\hat\sigma^2$. Generate monte-carlo means, $\hat\theta^{mc}$, for many draws of length-$T$ samples. Compare $\hat\theta$ to this computer generated distribution: $\Pr(\hat\theta^{mc} > \hat\theta)$.
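
A minimal sketch of this monte-carlo test, assuming a hypothetical simulated return series in place of the real data. It is not the course's annualmean.py; the sample length and the number of simulations are arbitrary choices.

```python
import random
from statistics import fmean, stdev

rng = random.Random(5)
T = 120
# Hypothetical annual returns standing in for the real data.
returns = [rng.gauss(0.08, 0.17) for _ in range(T)]
theta_hat = fmean(returns)
sigma_hat = stdev(returns)

theta_null = 0.06   # null hypothesis for the population mean
n_sims = 5000

# Each simulation: draw T normal returns under the null, keep the mean.
mc_means = [
    fmean([rng.gauss(theta_null, sigma_hat) for _ in range(T)])
    for _ in range(n_sims)
]

# p-value: Pr(theta_mc > theta_hat) under the null.
p_value = sum(m > theta_hat for m in mc_means) / n_sims
print(theta_hat, p_value)
```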

36 Bootstrap test: Assume $R_t$ is the population. Data readjust (need to shift $R_t$ to the null): $R^a_t = R_t - \hat\theta + 0.06$. This adjusts to a new series with $\theta = E(R^a_t) = 0.06$ (population): remove the population mean for the bootstrap, $\hat\theta$, and add the null hypothesis, 0.06. Redraw new samples of length $T$, with probability $1/T$ on each $R^a_t$ (with replacement, many ($B$) times). Store each estimated mean $\hat\theta_b$. Compare $\hat\theta$ to this computer generated distribution: $\Pr(\hat\theta_b > \hat\theta)$.
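
The shift-then-resample procedure might look like the sketch below. The return series is hypothetical, and B = 5000 is an arbitrary choice; `random.choices` draws with replacement and equal probability on each observation.

```python
import random
from statistics import fmean

rng = random.Random(6)
T = 120
# Hypothetical annual returns standing in for the real data.
returns = [rng.gauss(0.08, 0.17) for _ in range(T)]
theta_hat = fmean(returns)

theta_null = 0.06
# Shift the data so the bootstrap population has mean 0.06: R^a_t = R_t - theta_hat + 0.06.
adjusted = [r - theta_hat + theta_null for r in returns]

B = 5000
# Each bootstrap sample: T draws with replacement, probability 1/T on each R^a_t.
boot_means = [fmean(rng.choices(adjusted, k=T)) for _ in range(B)]

# p-value: Pr(theta_b > theta_hat) under the shifted (null) population.
p_value = sum(m > theta_hat for m in boot_means) / B
print(theta_hat, p_value)
```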

37 Bootstrap t-test (1): Redraw new samples of length $T$, with probability $1/T$ on each $R_t$ (with replacement, many ($B$) times). Estimate the student-t test statistic, $Z_b$, for each bootstrap sample:
$Z_b = \frac{\hat\theta_b - \hat\theta}{\hat\sigma_b/\sqrt{T}}$
$\hat\theta_b$ and $\hat\sigma_b$ are both estimated for each bootstrap sample drawn from the original set of returns (which represents the population).

38 Bootstrap t-test (2):
$Z_b = \frac{\hat\theta_b - \hat\theta}{\hat\sigma_b/\sqrt{T}}$, $Z = \frac{\hat\theta - \theta}{\hat\sigma/\sqrt{T}}$
Store each value $Z_b$. Compare $Z$ to this computer generated distribution: $\Pr(Z_b > Z)$. Remember that in bootstrapping, the population (urn) is $R_t$ (the sample), and $E(R_t) = \hat\theta = \frac{1}{T}\sum_{t=1}^{T} R_t$. Bootstrap "mantra": population = sample (1/T).
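
A minimal sketch of the bootstrap t-test, again on a hypothetical return series. Note that the resampling here uses the original (unshifted) returns, and each $Z_b$ is centered at $\hat\theta$, the mean of the bootstrap population. The helper name `t_stat` and B = 2000 are illustrative choices.

```python
import math
import random
from statistics import fmean, stdev

def t_stat(sample, center):
    """(mean - center) / (std / sqrt(T)) for one sample."""
    return (fmean(sample) - center) / (stdev(sample) / math.sqrt(len(sample)))

rng = random.Random(7)
T = 120
# Hypothetical annual returns standing in for the real data.
returns = [rng.gauss(0.08, 0.17) for _ in range(T)]
theta_hat = fmean(returns)

# Z from the data, measured against the null theta = 0.06.
z = t_stat(returns, 0.06)

B = 2000
# Z_b from each bootstrap sample, measured against theta_hat.
z_boot = [t_stat(rng.choices(returns, k=T), theta_hat) for _ in range(B)]

# p-value: Pr(Z_b > Z).
p_value = sum(zb > z for zb in z_boot) / B
print(z, p_value)
```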

39 Python code: annualmean.py performs all 3 tests.

40 Mean difference: Estimate mean annual real returns across history: [subperiods and their estimated means not preserved]. This is an increase of [value not preserved]. Is this significant/interesting? Use the bootstrap to simulate the equal expected return ($\theta$) null hypothesis.

41 Mean difference: bootstrap. Python: annualmeandiff.py. $\hat\theta_i$ is the estimated mean for each part $i = (1,2)$. Estimate the mean difference $\hat d = \hat\theta_2 - \hat\theta_1$. Bootstrap technique: Assume the entire set of returns is the population (same over time). Draw fake samples $i = 1$ and $i = 2$ of appropriate length from the entire sample. Estimate the mean difference $\hat d_b = \hat\theta^b_2 - \hat\theta^b_1$. Compare $\hat d$ to $\hat d_b$: $\Pr(\hat d_b > \hat d)$.
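
The technique above can be sketched as follows. This is not the course's annualmeandiff.py: the two subperiods are simulated with hypothetical lengths and means, since the actual data are not available here.

```python
import random
from statistics import fmean

rng = random.Random(8)
# Hypothetical returns for the two parts of the sample.
part1 = [rng.gauss(0.07, 0.17) for _ in range(70)]
part2 = [rng.gauss(0.09, 0.17) for _ in range(50)]
d_hat = fmean(part2) - fmean(part1)

# Null: one population, the same over time, so pool everything.
pooled = part1 + part2

B = 5000
d_boot = []
for _ in range(B):
    # Draw fake samples of the appropriate lengths from the pooled data.
    fake1 = rng.choices(pooled, k=len(part1))
    fake2 = rng.choices(pooled, k=len(part2))
    d_boot.append(fmean(fake2) - fmean(fake1))

# p-value: Pr(d_b > d_hat) under the equal-mean null.
p_value = sum(d > d_hat for d in d_boot) / B
print(d_hat, p_value)
```

Because both fake samples come from the same pool, the bootstrap differences are centered near zero, which is exactly the equal expected return null.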

42 Confidence intervals

43 Confidence intervals: Regions which contain true parameters. Show uncertainty about our estimates. First experiment (monte-carlo): Assume annual returns are normal at the estimated mean and std. from the data, $\theta = \hat\theta$, $\sigma = \hat\sigma$. Simulate a sample of $T$ normal returns. Estimate the mean, $\hat\theta$, in each sample, and plot the distribution. How much does it vary around the true value?

44 Normal monte-carlo. [Figure: histogram of the estimated mean stock return (frequency), showing the distribution with its lower and upper quantiles marked; quantile levels not preserved.]

45 Statistics reminder: This simulation is useful, but you should remember that $E(\hat\theta) = \theta$ and $\mathrm{Std.}(\hat\theta) = \sigma_\theta = \sigma/\sqrt{T}$. $\hat\theta$ follows a normal distribution, $N(\theta, \sigma_\theta^2)$. 95% of the distribution for $\hat\theta$ lies within
$[\theta - 1.96\sigma_\theta,\ \theta + 1.96\sigma_\theta]$
$[\theta + \Phi^{-1}(0.025)\sigma_\theta,\ \theta + \Phi^{-1}(0.975)\sigma_\theta]$
$\Phi(x)$ is the cumulative distribution function for a standard normal $N(0,1)$. $\Phi^{-1}(p)$ is the $p$-quantile for $N(0,1)$.
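
These facts are easy to check numerically. The sketch below uses `statistics.NormalDist().inv_cdf` for $\Phi^{-1}$ and a small simulation with assumed population values (θ = 0.08, σ = 0.17, T = 100) that are illustrative only.

```python
import math
import random
from statistics import NormalDist, fmean

# Standard normal quantiles: Phi^{-1}(0.025) and Phi^{-1}(0.975).
z_low = NormalDist().inv_cdf(0.025)    # about -1.96
z_high = NormalDist().inv_cdf(0.975)   # about +1.96

rng = random.Random(9)
theta, sigma, T = 0.08, 0.17, 100      # assumed population values
sigma_theta = sigma / math.sqrt(T)     # Std.(theta_hat) = sigma / sqrt(T)

# Simulate many samples and record each estimated mean.
means = [fmean([rng.gauss(theta, sigma) for _ in range(T)]) for _ in range(5000)]

# Share of estimated means falling inside the 95% band around theta.
inside = sum(
    theta + z_low * sigma_theta < m < theta + z_high * sigma_theta
    for m in means
) / len(means)
print(z_low, z_high, inside)
```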

46 Location for estimates: If the true value of $\theta$ = [value not preserved], then the estimated mean from various samples of length $T$ will lie within [0.05, 0.10] with probability 0.95. This is nice, but not quite what we want. We do know that with probability 0.95 the true expected value, $\theta$, will be within $1.96\sigma_\theta$ of $\hat\theta$, our estimated mean: $\Pr(|\hat\theta - \theta| < 1.96\sigma_\theta) = 0.95$.

47 What we really want: $\Pr(|\theta - \hat\theta| < 1.96\sigma_\theta) = 0.95$. Define the region $A = [\hat\theta - 1.96\sigma_\theta,\ \hat\theta + 1.96\sigma_\theta]$. The probability that $A$ covers $\theta$ is $\alpha = 0.95$. Replace $\sigma_\theta$ with the sample estimate, $\hat\sigma_\theta = \hat\sigma/\sqrt{T} = 0.17/\sqrt{T}$. This is your typical confidence band around the estimate, $\hat\theta$: $A$ = [endpoints not preserved].
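
The coverage claim for the region A can be checked by simulation: build the band from each simulated sample and count how often it contains the true θ. The population values below are assumptions for illustration, and `normal_band` is a hypothetical helper name.

```python
import math
import random
from statistics import fmean, stdev

def normal_band(sample, z=1.96):
    """[theta_hat - z * sigma_hat/sqrt(T), theta_hat + z * sigma_hat/sqrt(T)]."""
    theta_hat = fmean(sample)
    sigma_theta_hat = stdev(sample) / math.sqrt(len(sample))
    return theta_hat - z * sigma_theta_hat, theta_hat + z * sigma_theta_hat

rng = random.Random(10)
theta, sigma, T = 0.08, 0.17, 100   # assumed population values
n_sims = 2000

covered = 0
for _ in range(n_sims):
    lo, hi = normal_band([rng.gauss(theta, sigma) for _ in range(T)])
    covered += lo <= theta <= hi     # does this band cover the true theta?

coverage = covered / n_sims
print(coverage)
```

The estimated coverage should come out close to 0.95, typically a little below it because the band uses the estimated std. rather than the true one.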

48 Moving confidence region. [Figure: number line showing intervals of half-width $h$ around $\theta$ and around $\hat\theta$.] The region is about the distribution around $\theta$. Since we are interested in distances from $\hat\theta$, we simply pick $[\theta - h,\ \theta + h]$ up and move it to $\hat\theta$.

49 Connection to hypothesis tests: Remember the two-sided test, $\Pr(|\hat\theta - \theta| > C_{0.05}) = 0.05$. Estimate the critical value $C$ for this (usually don't need $\theta$). Then find all values of $\theta$ where $|\hat\theta - \theta| \le C_{0.05}$. These would also be the 0.95 confidence region, or all values of $\theta$ where the null hypothesis of $E(R_t) = \theta$ is not rejected at the 0.05 level.

50 Bootstrap confidence intervals: Get the bootstrap distribution of the statistic, $\hat\theta_b$, $b = 1,\ldots,B$. Two methods:
1. Normal bands: Use the bootstrap to estimate $\sigma_\theta = \mathrm{std}(\hat\theta_b)$, then use standard normal distribution bands, $[\hat\theta + \Phi^{-1}(\alpha)\sigma_\theta,\ \hat\theta + \Phi^{-1}(1-\alpha)\sigma_\theta]$. $\Phi(x)$ is the cumulative distribution function for a standard normal.
2. Percentile method: Use the bootstrap values, $\hat\theta_b$, to estimate the distribution for $\hat\theta$, then get quantiles for this, $[q_\alpha(\hat\theta_b),\ q_{1-\alpha}(\hat\theta_b)]$.
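
Both methods can be sketched in a few lines. This is not the course's annualmeanconf.py: the sample is hypothetical, the function names are illustrative, and the percentile method uses a simple sorted-index quantile rather than an interpolated one. Here α is the tail probability (0.025 for a 95% band).

```python
import random
from statistics import NormalDist, fmean, stdev

def bootstrap_means(sample, B, rng):
    """Means of B resamples of len(sample), drawn with replacement."""
    T = len(sample)
    return [fmean(rng.choices(sample, k=T)) for _ in range(B)]

def normal_band(theta_hat, boot, alpha):
    """Method 1: theta_hat plus normal quantiles times the bootstrap std."""
    sigma_theta = stdev(boot)
    nd = NormalDist()
    return (theta_hat + nd.inv_cdf(alpha) * sigma_theta,
            theta_hat + nd.inv_cdf(1 - alpha) * sigma_theta)

def percentile_band(boot, alpha):
    """Method 2: empirical alpha and 1 - alpha quantiles of the bootstrap means."""
    xs = sorted(boot)
    lo_idx = int(alpha * len(xs))
    hi_idx = min(len(xs) - 1, int((1 - alpha) * len(xs)))
    return xs[lo_idx], xs[hi_idx]

rng = random.Random(11)
sample = [rng.gauss(0.08, 0.17) for _ in range(100)]   # hypothetical returns
theta_hat = fmean(sample)
boot = bootstrap_means(sample, B=4000, rng=rng)

band_normal = normal_band(theta_hat, boot, alpha=0.025)
band_pct = percentile_band(boot, alpha=0.025)
print(band_normal, band_pct)
```

For a statistic like the mean the two bands should nearly coincide; they can differ noticeably when the bootstrap distribution is skewed.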

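51 Bootstrap mean: Each draw: $(x_1, x_2, \ldots, x_T) \rightarrow (x^b_1, x^b_2, \ldots, x^b_T)$. Probability: $(\frac{1}{T}, \frac{1}{T}, \ldots, \frac{1}{T})$. Do this $b = 1, 2, \ldots, B$ times.
$\hat\theta_b = \frac{1}{T}\sum_{t=1}^{T} x^b_t, \quad b = 1,\ldots,B$
$E(x^b_t) = \frac{1}{T}x_1 + \frac{1}{T}x_2 + \cdots + \frac{1}{T}x_T = \hat\theta$
$E(\hat\theta_b) = \frac{1}{T}\sum_{t=1}^{T} E(x^b_t) = \frac{1}{T}\sum_{t=1}^{T} \hat\theta = \hat\theta$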

52 Bootstrap mean:
$E(\hat\theta_b) = \frac{1}{T}\sum_{t=1}^{T} E(x^b_t) = \frac{1}{T}\sum_{t=1}^{T} \hat\theta = \hat\theta$
What does this say? The bootstrap for the mean is centered around the sample mean. It doesn't help get a better point estimate. This is true for many (not all) statistics.

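53 Bootstrap variance:
$\sigma_\theta^2 = \mathrm{var}(\hat\theta_b) = \frac{1}{B-1}\sum_{b=1}^{B} (\hat\theta_b - \bar\theta)^2$
$\bar\theta = \frac{1}{B}\sum_{b=1}^{B} \hat\theta_b$
$\sigma_\theta = \sqrt{\sigma_\theta^2} = \mathrm{std}(\hat\theta_b)$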
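
A short numerical check of the bootstrap mean and variance formulas, on a hypothetical sample: the average of the bootstrap means should sit at the sample mean, and their std. should be close to the analytic value $\hat\sigma/\sqrt{T}$. The sample, seed, and B are illustrative assumptions.

```python
import math
import random
from statistics import fmean, stdev

rng = random.Random(12)
T = 100
sample = [rng.gauss(0.08, 0.17) for _ in range(T)]   # hypothetical returns
theta_hat = fmean(sample)

B = 5000
# theta_b: mean of each resample drawn with probability 1/T on each observation.
boot = [fmean(rng.choices(sample, k=T)) for _ in range(B)]

theta_bar = fmean(boot)        # (1/B) * sum of theta_b
sigma_theta = stdev(boot)      # sqrt of the 1/(B-1) variance of theta_b
analytic = stdev(sample) / math.sqrt(T)   # sigma_hat / sqrt(T)
print(theta_hat, theta_bar, sigma_theta, analytic)
```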

54 Bootstrap mean confidence interval. [Figure: histogram of the bootstrap distribution ($\hat\theta_b$) of the estimated mean stock return (frequency), with its lower and upper quantiles marked; quantile levels not preserved.]

55 More on the Percentile Method: The bootstrap distribution is centered at the sample value. Why? Population = sample, so $E(\hat\theta_b) = E(R^b_t) = \hat\theta$ for the bootstrap. What is going on? Assume the bootstrap distribution of $\hat\theta_b$ centered at $\hat\theta$ is the same as that of $\hat\theta$ centered around $\theta$. We pick it up and move it to $\hat\theta$ as we did in the analytic case. The percentile bootstrap does this automatically. Assumptions: Need a symmetric distribution for $\hat\theta$. Do NOT need a normal distribution for anything. Python code: annualmeanconf.py

56 More on Bootstraps, and B: $B$ usually needs to be pretty large. Depends a little on what you need. For std., $B$ around [value not preserved] can be ok. For small quantiles, $\alpha$ = [value not preserved], you need a large $B$ = 100,000. This might be why you prefer normal (method 1) confidence bands.

57 What about asymmetric distributions for $\hat\theta$? Gets complicated. Many methods (no dominant method). Most statistics we will look at will be symmetric.

58 Why bother with bootstrap? This example was designed to be familiar (the mean). Normal approximations look good. Why bootstrap? Deviations from normality. Statistics that have no analytic results. Analytics might be difficult or time consuming.

59 Overview: Hypothesis testing terms; Testing a die; Testing issues; Estimating means; Confidence intervals.


More information

STA 2101/442 Assignment 2 1

STA 2101/442 Assignment 2 1 STA 2101/442 Assignment 2 1 These questions are practice for the midterm and final exam, and are not to be handed in. 1. A polling firm plans to ask a random sample of registered voters in Quebec whether

More information

(1) Introduction to Bayesian statistics

(1) Introduction to Bayesian statistics Spring, 2018 A motivating example Student 1 will write down a number and then flip a coin If the flip is heads, they will honestly tell student 2 if the number is even or odd If the flip is tails, they

More information

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions Part 1: Probability Distributions Purposes of Data Analysis True Distributions or Relationships in the Earths System Probability Distribution Normal Distribution Student-t Distribution Chi Square Distribution

More information

Regression Estimation Least Squares and Maximum Likelihood

Regression Estimation Least Squares and Maximum Likelihood Regression Estimation Least Squares and Maximum Likelihood Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 3, Slide 1 Least Squares Max(min)imization Function to minimize

More information

Study and research skills 2009 Duncan Golicher. and Adrian Newton. Last draft 11/24/2008

Study and research skills 2009 Duncan Golicher. and Adrian Newton. Last draft 11/24/2008 Study and research skills 2009. and Adrian Newton. Last draft 11/24/2008 Inference about the mean: What you will learn Why we need to draw inferences from samples The difference between a population and

More information

Better Bootstrap Confidence Intervals

Better Bootstrap Confidence Intervals by Bradley Efron University of Washington, Department of Statistics April 12, 2012 An example Suppose we wish to make inference on some parameter θ T (F ) (e.g. θ = E F X ), based on data We might suppose

More information

Modern Methods of Data Analysis - WS 07/08

Modern Methods of Data Analysis - WS 07/08 Modern Methods of Data Analysis Lecture VII (26.11.07) Contents: Maximum Likelihood (II) Exercise: Quality of Estimators Assume hight of students is Gaussian distributed. You measure the size of N students.

More information

Introduction to Econometrics. Review of Probability & Statistics

Introduction to Econometrics. Review of Probability & Statistics 1 Introduction to Econometrics Review of Probability & Statistics Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com Introduction 2 What is Econometrics? Econometrics consists of the application of mathematical

More information

Inference in Regression Model

Inference in Regression Model Inference in Regression Model Christopher Taber Department of Economics University of Wisconsin-Madison March 25, 2009 Outline 1 Final Step of Classical Linear Regression Model 2 Confidence Intervals 3

More information

Section 2: Estimation, Confidence Intervals and Testing Hypothesis

Section 2: Estimation, Confidence Intervals and Testing Hypothesis Section 2: Estimation, Confidence Intervals and Testing Hypothesis Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/

More information

What is a random variable

What is a random variable OKAN UNIVERSITY FACULTY OF ENGINEERING AND ARCHITECTURE MATH 256 Probability and Random Processes 04 Random Variables Fall 20 Yrd. Doç. Dr. Didem Kivanc Tureli didemk@ieee.org didem.kivanc@okan.edu.tr

More information

Bayesian Inference for Normal Mean

Bayesian Inference for Normal Mean Al Nosedal. University of Toronto. November 18, 2015 Likelihood of Single Observation The conditional observation distribution of y µ is Normal with mean µ and variance σ 2, which is known. Its density

More information

Warm-up Using the given data Create a scatterplot Find the regression line

Warm-up Using the given data Create a scatterplot Find the regression line Time at the lunch table Caloric intake 21.4 472 30.8 498 37.7 335 32.8 423 39.5 437 22.8 508 34.1 431 33.9 479 43.8 454 42.4 450 43.1 410 29.2 504 31.3 437 28.6 489 32.9 436 30.6 480 35.1 439 33.0 444

More information

MA 1125 Lecture 15 - The Standard Normal Distribution. Friday, October 6, Objectives: Introduce the standard normal distribution and table.

MA 1125 Lecture 15 - The Standard Normal Distribution. Friday, October 6, Objectives: Introduce the standard normal distribution and table. MA 1125 Lecture 15 - The Standard Normal Distribution Friday, October 6, 2017. Objectives: Introduce the standard normal distribution and table. 1. The Standard Normal Distribution We ve been looking at

More information

UCLA STAT 251. Statistical Methods for the Life and Health Sciences. Hypothesis Testing. Instructor: Ivo Dinov,

UCLA STAT 251. Statistical Methods for the Life and Health Sciences. Hypothesis Testing. Instructor: Ivo Dinov, UCLA STAT 251 Statistical Methods for the Life and Health Sciences Instructor: Ivo Dinov, Asst. Prof. In Statistics and Neurology University of California, Los Angeles, Winter 22 http://www.stat.ucla.edu/~dinov/

More information

STA Module 4 Probability Concepts. Rev.F08 1

STA Module 4 Probability Concepts. Rev.F08 1 STA 2023 Module 4 Probability Concepts Rev.F08 1 Learning Objectives Upon completing this module, you should be able to: 1. Compute probabilities for experiments having equally likely outcomes. 2. Interpret

More information

SPRING 2007 EXAM C SOLUTIONS

SPRING 2007 EXAM C SOLUTIONS SPRING 007 EXAM C SOLUTIONS Question #1 The data are already shifted (have had the policy limit and the deductible of 50 applied). The two 350 payments are censored. Thus the likelihood function is L =

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

Bayesian Inference. Chapter 2: Conjugate models

Bayesian Inference. Chapter 2: Conjugate models Bayesian Inference Chapter 2: Conjugate models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in

More information

Difference between means - t-test /25

Difference between means - t-test /25 Difference between means - t-test 1 Discussion Question p492 Ex 9-4 p492 1-3, 6-8, 12 Assume all variances are not equal. Ignore the test for variance. 2 Students will perform hypothesis tests for two

More information

Answers and expectations

Answers and expectations Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E

More information

STA Module 10 Comparing Two Proportions

STA Module 10 Comparing Two Proportions STA 2023 Module 10 Comparing Two Proportions Learning Objectives Upon completing this module, you should be able to: 1. Perform large-sample inferences (hypothesis test and confidence intervals) to compare

More information

Distributions of linear combinations

Distributions of linear combinations Distributions of linear combinations CE 311S MORE THAN TWO RANDOM VARIABLES The same concepts used for two random variables can be applied to three or more random variables, but they are harder to visualize

More information

One-Sample Numerical Data

One-Sample Numerical Data One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

Visual interpretation with normal approximation

Visual interpretation with normal approximation Visual interpretation with normal approximation H 0 is true: H 1 is true: p =0.06 25 33 Reject H 0 α =0.05 (Type I error rate) Fail to reject H 0 β =0.6468 (Type II error rate) 30 Accept H 1 Visual interpretation

More information

STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples.

STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. Rebecca Barter March 16, 2015 The χ 2 distribution The χ 2 distribution We have seen several instances

More information

6.4 Type I and Type II Errors

6.4 Type I and Type II Errors 6.4 Type I and Type II Errors Ulrich Hoensch Friday, March 22, 2013 Null and Alternative Hypothesis Neyman-Pearson Approach to Statistical Inference: A statistical test (also known as a hypothesis test)

More information

Statistics for IT Managers

Statistics for IT Managers Statistics for IT Managers 95-796, Fall 2012 Module 2: Hypothesis Testing and Statistical Inference (5 lectures) Reading: Statistics for Business and Economics, Ch. 5-7 Confidence intervals Given the sample

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

20 Hypothesis Testing, Part I

20 Hypothesis Testing, Part I 20 Hypothesis Testing, Part I Bob has told Alice that the average hourly rate for a lawyer in Virginia is $200 with a standard deviation of $50, but Alice wants to test this claim. If Bob is right, she

More information

Econometrics A. Simple linear model (2) Keio University, Faculty of Economics. Simon Clinet (Keio University) Econometrics A October 16, / 11

Econometrics A. Simple linear model (2) Keio University, Faculty of Economics. Simon Clinet (Keio University) Econometrics A October 16, / 11 Econometrics A Keio University, Faculty of Economics Simple linear model (2) Simon Clinet (Keio University) Econometrics A October 16, 2018 1 / 11 Estimation of the noise variance σ 2 In practice σ 2 too

More information

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary Patrick Breheny October 13 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction Introduction What s wrong with z-tests? So far we ve (thoroughly!) discussed how to carry out hypothesis

More information

This does not cover everything on the final. Look at the posted practice problems for other topics.

This does not cover everything on the final. Look at the posted practice problems for other topics. Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry

More information

Design of Engineering Experiments

Design of Engineering Experiments Design of Engineering Experiments Hussam Alshraideh Chapter 2: Some Basic Statistical Concepts October 4, 2015 Hussam Alshraideh (JUST) Basic Stats October 4, 2015 1 / 29 Overview 1 Introduction Basic

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

Steve Smith Tuition: Maths Notes

Steve Smith Tuition: Maths Notes Maths Notes : Discrete Random Variables Version. Steve Smith Tuition: Maths Notes e iπ + = 0 a + b = c z n+ = z n + c V E + F = Discrete Random Variables Contents Intro The Distribution of Probabilities

More information

Math Review Sheet, Fall 2008

Math Review Sheet, Fall 2008 1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the

More information

1 A brief primer on probability distributions

1 A brief primer on probability distributions Inference, Models and Simulation for Comple Systems CSCI 7- Lecture 3 August Prof. Aaron Clauset A brief primer on probability distributions. Probability distribution functions A probability density function

More information

Statistical Inference

Statistical Inference Statistical Inference Bernhard Klingenberg Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Outline Estimation: Review of concepts

More information

Evaluating Hypotheses

Evaluating Hypotheses Evaluating Hypotheses IEEE Expert, October 1996 1 Evaluating Hypotheses Sample error, true error Confidence intervals for observed hypothesis error Estimators Binomial distribution, Normal distribution,

More information

Probability and Statistics

Probability and Statistics Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT

More information

Two-sample inference: Continuous data

Two-sample inference: Continuous data Two-sample inference: Continuous data Patrick Breheny April 6 Patrick Breheny University of Iowa to Biostatistics (BIOS 4120) 1 / 36 Our next several lectures will deal with two-sample inference for continuous

More information

Statistical Inference

Statistical Inference Statistical Inference Classical and Bayesian Methods Class 6 AMS-UCSC Thu 26, 2012 Winter 2012. Session 1 (Class 6) AMS-132/206 Thu 26, 2012 1 / 15 Topics Topics We will talk about... 1 Hypothesis testing

More information