1 Review of Probability & Statistics

a. In a group of 2000 people, it has been reported that there are:

612 smokers
670 people over 25
960 people who imbibe (drink alcohol)
86 smokers who imbibe
290 imbibers over 25
180 smokers over 25
44 people over 25 who both smoke and imbibe
250 people under 25 who neither smoke nor imbibe

Determine whether the above report is consistent.

Solution: Let $A$ be the set of smokers, $B$ be the set of people over 25, and $C$ be the set of people who imbibe. So we have $|A| = 612$, $|B| = 670$, $|C| = 960$, $|A \cap C| = 86$, $|B \cap C| = 290$, $|A \cap B| = 180$, $|A \cap B \cap C| = 44$, and $|\overline{A \cup B \cup C}| = 250$.

On the one hand, we have

$$|A \cup B \cup C| = 2000 - |\overline{A \cup B \cup C}| = 2000 - 250 = 1750.$$

On the other hand, by inclusion-exclusion we have

$$|A \cup B \cup C| = |A| + |B| + |C| - (|A \cap B| + |A \cap C| + |B \cap C|) + |A \cap B \cap C| = (612 + 670 + 960) - (180 + 86 + 290) + 44 = 1730.$$

Since $1750 \neq 1730$, the above report is NOT consistent.

b. A game begins by choosing between dice A and B in some manner such that the probability that A is selected is $p$. The die thus selected is then tossed until a white face appears, at which time the game is concluded. Die A has 4 red and 2 white faces. Die B has 2 red and 4 white faces. After playing this game a great many times, it is observed that the probability that a game is concluded in exactly 3 tosses of the selected die is 7/81. Determine the value of $p$.

Solution: The probability that die B is selected is $1 - p$. If A is selected, the probability that a toss shows a red face is 4/6 (because die A has 4 red faces), and the probability that it shows a white face is 2/6. Similarly, if die B is selected, the probability of a red face is 2/6 and the probability of a white face is 4/6. If a game is concluded in exactly 3 tosses, then the first and the second toss must show a red face and the third a white face. So we have

$$p \left(\frac{4}{6}\right)^2 \frac{2}{6} + (1 - p) \left(\frac{2}{6}\right)^2 \frac{4}{6} = \frac{7}{81}, \quad \text{i.e.,} \quad \frac{4}{27} p + \frac{2}{27} (1 - p) = \frac{7}{81},$$

hence $p = 1/6$.
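As a check on part b, the following Python sketch recomputes $p$ with exact fractions. It is not part of the original solution: the function names `prob_three_tosses` and `solve_p` are illustrative, and the solver relies only on the fact that the probability is linear in $p$.

```python
from fractions import Fraction

def prob_three_tosses(p):
    """P(game ends on exactly the third toss) when die A is chosen with probability p."""
    red_a, white_a = Fraction(4, 6), Fraction(2, 6)   # die A: 4 red, 2 white faces
    red_b, white_b = Fraction(2, 6), Fraction(4, 6)   # die B: 2 red, 4 white faces
    # Two red tosses followed by one white toss, for whichever die was selected.
    return p * red_a**2 * white_a + (1 - p) * red_b**2 * white_b

def solve_p(target):
    """Solve prob_three_tosses(p) == target, using that the left side is linear in p."""
    f0 = prob_three_tosses(Fraction(0))
    f1 = prob_three_tosses(Fraction(1))
    return (target - f0) / (f1 - f0)

p = solve_p(Fraction(7, 81))
print(p)  # prints 1/6
```

Using `Fraction` avoids floating-point rounding, so the result can be compared exactly with the hand derivation.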

c. Companies A, B, C, D and E each send three delegates to a conference. A committee of four delegates, selected by lot, is formed. Determine the probability that:

1. company A is not represented on the committee.
2. company A has exactly one representative on the committee.
3. neither company A nor company E is represented on the committee.

Solution: Without any restrictions, forming the committee means selecting four people from 15 persons. The total number of these combinations is denoted by $C_{15}^{4}$ (select 4 out of 15).

1. If company A is not represented on the committee, then we have to select 4 people from the 12 delegates of companies B, C, D, and E, with $C_{12}^{4}$ combinations. So the probability in this case is $C_{12}^{4} / C_{15}^{4} = 495/1365 = 33/91 \approx 0.363$.

2. If A has exactly one representative on the committee, then we select one of the 3 delegates of A and the other 3 people from companies B, C, D, and E. The number of these combinations is $C_{3}^{1} C_{12}^{3}$. So the probability in this case is $C_{3}^{1} C_{12}^{3} / C_{15}^{4} = 660/1365 = 44/91 \approx 0.484$.

3. If neither company A nor company E is represented on the committee, then we have to choose 4 people from the 9 delegates of companies B, C, and D, with $C_{9}^{4}$ combinations. So the probability in this case is $C_{9}^{4} / C_{15}^{4} = 126/1365 = 6/65 \approx 0.092$.

d. Two batches of a certain chemical were delivered to a factory. For each batch ten determinations were made of the percentage of manganese in the chemical. The results were as follows:

Batch 1:
Batch 2:

Is there a significant difference between the two sample means (averages) of Batch 1 and Batch 2?

Solution: Let $\mu_1$ and $\mu_2$ be the true percentages of manganese in batch 1 and batch 2, respectively. The null hypothesis is $H_0: \mu_1 = \mu_2$, and the alternative hypothesis is $H_1: \mu_1 \neq \mu_2$. In this case we do a two-tailed test and assume both sets of data are normally distributed. For Batch 1, the mean is

$$\bar{x}_1 = \frac{1}{10} \sum_{i=1}^{10} x_i = 3.6,$$

and the sample variance is

$$s_1^2 = \frac{1}{9} \left( \sum_{i=1}^{10} x_i^2 - \frac{1}{10} \Big( \sum_{i=1}^{10} x_i \Big)^2 \right).$$

For Batch 2, the mean is $\bar{x}_2 = 3.5$, and the sample variance $s_2^2$ is computed in the same way. The combined estimate of the variance is $s^2 = (s_1^2 + s_2^2)/2$. Thus the test statistic is

$$t_0 = \frac{\bar{x}_1 - \bar{x}_2}{s \sqrt{2/10}} = 2.98.$$

By checking the t-table, we have $t_{0.05,18} = 2.1$. Since $t_0 > t_{0.05,18}$, it is reasonable to reject $H_0$, i.e., there is a significant difference between the two batches at a 95% confidence level.

e. For question d, check whether the sample variances of the two batches are estimates of the same population variance.

Solution: Suppose the two samples are drawn from normal distributions with variances $\sigma_1^2$ and $\sigma_2^2$. Let $s_1^2$ be the estimate of $\sigma_1^2$, and let $s_2^2$ be the estimate of $\sigma_2^2$. We assume both populations are normally distributed and use the F-test to test the null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ against the alternative hypothesis $H_1: \sigma_1^2 \neq \sigma_2^2$. The test statistic is $F_0 = s_1^2 / s_2^2 = 1.8$. By checking the F-table, we have $F_{0.05,9,9} = 4.03$, which is bigger than $F_0$, so we cannot reject $H_0$, i.e., the data are consistent with the sample variances of the two batches being estimates of the same population variance.

f. A die is tossed 120 times. Find the probability that a four will turn up less than 15 times. Show the precise formula and its approximation by a normal distribution.

Solution: For each toss, the probability $p$ that a four turns up is $p = 1/6$, and the probability that it does not turn up is $1 - p$. If a four turns up less than 15 times in 120 tosses, then it turns up between 0 and 14 times. If it turns up $k$ ($0 \le k \le 14$) times out of 120, then it does not turn up the other $120 - k$ times. These $k$ occurrences can fall anywhere in the sequence of 120 tosses, i.e., there are $C_{120}^{k}$ possibilities. So the probability that a four turns up less than 15 times is given by

$$Pr = \sum_{i=0}^{14} C_{120}^{i}\, p^{i} (1 - p)^{120 - i} \qquad (1)$$

For the approximation by a normal distribution, we have mean $\mu = np = 20$ and variance $\sigma^2 = np(1 - p) = 50/3$. So the probability is approximately

$$Pr = \int_{0}^{14} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}\, dx \qquad (2)$$

Let $v = (x - \mu)/\sigma$; then equation (2) is equivalent to

$$Pr = \int_{a}^{b} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{v^2}{2}}\, dv \qquad (3)$$

where $a = (0 - np)/\sigma = -4.90$ and $b = (14 - np)/\sigma = -1.47$. So $Pr = \Phi(b) - \Phi(a) \approx 0.0708 - 0.0000 = 0.0708$.
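The exact sum in equation (1) and the normal approximation in equations (2) and (3) can be compared numerically. The sketch below is an illustration rather than part of the original solution; the helper names `binomial_cdf`, `phi`, and `normal_approx` are my own, and the approximation deliberately uses the same limits (0 and 14, no continuity correction) as the derivation above.

```python
from math import comb, erf, sqrt

def binomial_cdf(k_max, n, p):
    """Exact P(X <= k_max) for X ~ Binomial(n, p), as in equation (1)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k_max + 1))

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_approx(lo, hi, n, p):
    """Normal approximation to P(lo <= X <= hi), without continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return phi((hi - mu) / sigma) - phi((lo - mu) / sigma)

n, p = 120, 1 / 6
exact = binomial_cdf(14, n, p)
approx = normal_approx(0, 14, n, p)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

The two values differ somewhat because the approximation integrates only up to 14; replacing the upper limit by 14.5 (a continuity correction) is a common refinement.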

g. An electronic component is mass-produced and then tested unit-by-unit on an automatic testing machine which classifies the unit as good or defective. But there is a probability of 0.1 that the machine will misclassify the unit, so each component is in fact tested five times and regarded as good if so classified three or more times. What is now the probability of misclassification?

Solution: The component is now tested five times, so the overall result is wrong only when three or more of the five individual tests misclassify it. If it is misclassified $k$ ($3 \le k \le 5$) times, these misclassifications can fall anywhere in the sequence of 5 tests, i.e., there are $C_{5}^{k}$ possibilities. So the probability of misclassification is now given by

$$Pr = \sum_{i=3}^{5} C_{5}^{i}\, 0.1^{i} (1 - 0.1)^{5 - i} \qquad (4)$$

which is $C_{5}^{3}\, 0.1^{3} (1 - 0.1)^{2} + C_{5}^{4}\, 0.1^{4} (1 - 0.1)^{1} + C_{5}^{5}\, 0.1^{5} (1 - 0.1)^{0} = 0.0081 + 0.00045 + 0.00001 = 0.00856$.

2 Decision Tree Learning

a. Train a C5.0 classifier on the Adult data set. Familiarize yourself with and make maximum use of the features of the C5.0 program. Experiment with the pruning and softening threshold options. Document and comment on your efforts to improve the performance of your classifier by tuning C5.0.

Solution: I did various experiments with different options of the C5.0 program as follows:

1. Decision tree: Train the data set with the basic option: c5.0 -f adult. The size of the resulting decision tree was quite large, 91. This classifier misclassified 3923 cases on the training data, leading to an error of 12.0%. The error rate on test data is 13.9%, which is not quite satisfactory (see yogzhe/6505a1/a 1.txt for details).

2. Discrete value subsets: Since the data set has no discrete values, it did not make sense to test the -s option.

3. Rulesets: Train the data set using the -r option. The generated classifier (called a ruleset) consists of 97 rules (see yogzhe/6505a1/a 3.txt for details). A rule can be used to classify a new case if all of its conditions are satisfied. The error rates on the training data set and test data set are almost the same as those in the first experiment.

4. Adaptive boosting: I tried to train the data set using the -b option, which is equivalent to -x. However, it took too long (almost one and a half hours) to train the classifier with the boosting option. The final boost error on training data was as low as 3.0%, but as high as 15.3% on test data (see yogzhe/6505a1/a 4.txt for details).

5. Softening thresholds: Train the data set using the -p option. The result (see yogzhe/6505a1/a 5.txt for details) showed that most thresholds in this experiment were still quite tight. However, the error rates and decision tree size did not change from those in step 1, i.e., this experiment did not improve the accuracy of the classifier.

6. Confidence for pruning: The default confidence value of decision tree pruning is 25%. As the confidence value (option -c) increases, the program tends to do less pruning, which increases the size of the resulting decision tree. I did tests with 50% and 75% confidence values, and the sizes of the resulting trees are 9 and 1391 (see yogzhe/6505a1/a 6.txt for details), respectively, compared with 91 with the default value. At the same time, less pruning results in overfitting on the training data and worse performance on the test data. This is verified in Table 1.

7. Splitting size: The C5.0 program has the option -m to constrain the number of cases (default 2) at each branch point. As we can see in Table 2 (more details can be found at yogzhe/6505a1/a 7.txt), when m increases, the size of the resulting tree decreases significantly, while the error rates on training data and test data do not increase much.

8. Dataset sampling: C5.0 has an option -S to draw a sample from a large data set. It trains the classifier on this data sample and then tests it on a disjoint set of the remaining cases. The number of cases in the test set depends on the sampling value. I did tests on sample data of 20% to 80%, and the results (yogzhe/6505a1/a 8.txt) showed that there was a significant difference between the sizes of the resulting decision trees, but little difference between the error rates.

9. Cross-validation: f-fold cross-validation divides the dataset into f blocks of approximately equal size and class distribution. In each turn, the hold-out block is tested with the classifier trained on the remaining data cases. I ran cross-validation as usual, and the result (yogzhe/6505a1/a 9.txt) showed that there was no significant improvement in performance.

CF                     25%      50%      75%
decision tree size
training data error    12.0%    .4%      9.7%
test data error        13.9%    14.6%    15.0%

Table 1: Error rates on training data and test data with different confidence values for decision tree pruning

m
decision tree size
training data error    12.0%    12.8%    13.3%    13.9%    14.3%
test data error        13.9%    14.0%    14.0%    14.1%    14.4%

Table 2: Size of decision tree resulting from number of cases at each branch point

b. What is the error rate on the test data of the best decision tree classifier you can come up with? What is the 95% confidence interval of the estimate?

Solution: In the experiment (yogzhe/6505a1/a 1.txt), the classifier misclassified 2261 cases out of 16281 cases, leading to an error rate of 13.9%. This is the best classifier I achieved in the various experiments. Let $r = 2261$ and $n = 16281$; we assume $r/n$ is a good estimate of the true error rate $p$. We have $\sigma^2 = p(1 - p)/n \approx 7.351\mathrm{e}{-6}$, and $\varepsilon = 1.96\sigma$ for the 95% confidence interval of a normal distribution. So the 95% confidence interval is $r/n \pm \varepsilon \approx 0.139 \pm 0.0053$.

c. This is an unbalanced data set (24% of instances have label > 50K, 76% of instances have label ≤ 50K). Repeat parts a and b after balancing the training set. Balancing is done by repeating each instance in the > 50K class three times. Has balancing helped substantially improve the performance of the classifier? Is the difference between your error rate estimates in b and c statistically significant (with 95% confidence)?

Solution: I wrote a C program (yogzhe/6505a1/balace.c) to duplicate cases with label > 50K, and I copied these cases into the adult data set so that it is balanced. Then I repeated the nine experiments in a. The results showed that the error rate on training data decreased significantly, mostly below .0%, but the error rate on test data increased considerably, mostly above 18.0%. This can be explained by the classifiers being overfitted to the training data because of the triplication of cases with label > 50K, while the test data are far from balanced. In short, balancing did not help significantly improve the performance of the classifier, unless we balance the test data, too.

So our null hypothesis here is that the classifier trained on unbalanced data and the classifier trained on balanced data perform equally well. The error rate of the best classifier in c is 17.4% (yogzhe/6505a1/c 3.txt). Following a similar approach to question 1d, we have $s_1^2 = 7.351\mathrm{e}{-6}$ and $s_2^2 = 8.828\mathrm{e}{-6}$. Hence $s^2 = (s_1^2 + s_2^2)/2 = 8.09\mathrm{e}{-6}$. So the test statistic is

$$t_0 = \frac{13.9\% - 17.4\%}{\sqrt{8.09\mathrm{e}{-6}}} \approx -12.3.$$

We have $t_{0.05,18} = 2.1$. Since $|t_0| \gg t_{0.05,18}$, it is reasonable to reject the null hypothesis, i.e., the performance on unbalanced data is much better than that on balanced data.

d. There is a Java implementation of the C4.5 decision tree classifier that accompanies the textbook by Witten and Frank: ml/weka/. C4.5 is the last publicly available version of this family of classifiers, before they went commercial as C5.0. Repeat a, b and c using C4.5. Is there a statistically significant benefit (in terms of improved performance, i.e. reduced error rate) to using C5.0 instead of C4.5 on the adult dataset?

Solution: First I extracted the part of the adult data with native country United-States as the data set for the C4.5 program. There are 29170 and 14662 cases in the new training data set and new test data set, respectively. The best error rate I achieved on unbalanced test data is 14.0% (yogzhe/6505a1/d 1.txt), and 18.7% (yogzhe/6505a1/d.txt) for balanced data. I trained the C4.5 classifier with three options: confidence for pruning, splitting size, and cross-validation (with details at yogzhe/6505a1/d 6.txt, yogzhe/6505a1/d 7.txt, and yogzhe/6505a1/d 9.txt, respectively).

We have $r/n = 2052/14662 = 0.14$ as an estimate of the true error $p$. So $\sigma^2 = p(1 - p)/n \approx 8.212\mathrm{e}{-6}$, and we have $\varepsilon = 1.96\sigma$ for the 95% confidence interval of a normal distribution. So the 95% confidence interval is $r/n \pm \varepsilon \approx 0.140 \pm 0.0056$.

Following a similar approach to question c, we have $s_1^2 = 8.212\mathrm{e}{-6}$ and $s_2^2 = 1.037\mathrm{e}{-5}$. Hence $s^2 = (s_1^2 + s_2^2)/2 = 9.291\mathrm{e}{-6}$. So the test statistic is

$$t_0 = \frac{14.0\% - 18.7\%}{\sqrt{9.291\mathrm{e}{-6}}} \approx -15.4.$$

We have $t_{0.05,18} = 2.1$. Since $|t_0| \gg t_{0.05,18}$, it is reasonable to say that the performance of the C4.5 classifier on unbalanced data is much better than that on balanced data.

Strictly speaking, it is not appropriate to compare the performance of the C5.0 and C4.5 classifiers in this situation because they are trained on different data sets. Nevertheless, we have $p_1 = 0.139$, $p_2 = 0.140$, $n_1 = 16281$, and $n_2 = 14662$. So $s_1^2 = 7.351\mathrm{e}{-6}$ and $s_2^2 = 8.212\mathrm{e}{-6}$. Hence $s^2 = (s_1^2 + s_2^2)/2 = 7.78\mathrm{e}{-6}$. So the test statistic is

$$t_0 = \frac{13.9\% - 14.0\%}{\sqrt{7.78\mathrm{e}{-6}}} \approx -0.36.$$

We have $t_{0.05,18} = 2.1$. Since $|t_0| < t_{0.05,18}$, we cannot reject the hypothesis that the two classifiers perform equally well, i.e., these figures show no statistically significant benefit of C5.0 over C4.5 on the adult data set at a 95% confidence level.

e. In class we discussed how to classify an instance with missing values using an already built decision tree. Can you outline how the approach can be adapted for use in training a decision tree when instances of the training set may have missing values?

Solution: Suppose the instance with a missing value on attribute $A$ is of class $C$. The basic idea is to assign the most probable value $a_i$ of $A$ to the missing value. This can be done in three steps:

1. Suppose attribute $A$ has $k$ observed values $a_1, a_2, \ldots, a_k$. Count the frequencies $f_1, f_2, \ldots, f_k$ of the various values of $A$ separately.

2. Calculate the probability $p_i$ ($1 \le i \le k$) of each possible value $a_i$ based on $f_i$, i.e., $p_i = f_i / \sum_{j=1}^{k} f_j$. More generally, we can assign a weight $w_i$ to each possible value $a_i$ ($1 \le i \le k$).

3. Finally, we assign the value with maximum probability $\max_{i=1}^{k} p_i$ to the missing value.
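The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `impute_most_probable` and the use of `None` as the missing-value marker are hypothetical, and the frequencies are counted over whatever list of values is passed in (restricting that list to instances of the same class $C$ would be left to the caller).

```python
from collections import Counter

def impute_most_probable(values, missing=None):
    """Fill missing entries of one attribute with its most frequent observed value.

    Returns the filled list and the estimated probability p_i of each value.
    """
    observed = [v for v in values if v is not missing]
    if not observed:
        raise ValueError("no observed values to estimate from")
    freq = Counter(observed)                          # step 1: frequencies f_1..f_k
    total = sum(freq.values())
    probs = {v: f / total for v, f in freq.items()}   # step 2: p_i = f_i / sum_j f_j
    best = max(probs, key=probs.get)                  # step 3: value with maximum p_i
    filled = [best if v is missing else v for v in values]
    return filled, probs

# Hypothetical attribute column with two missing entries.
column = ["Private", "Private", None, "Self-emp", "Private", None]
filled, probs = impute_most_probable(column)
print(filled)
print(probs)
```

Returning the probabilities alongside the filled column keeps the door open for the weighted variant mentioned in step 2, where an instance would be split across values in proportion to $w_i$ instead of receiving a single value.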

3 Evaluation of Hypothesis

a. In a supervised learning problem, the test set $S$ is of size $n$. A hypothesis (a specific decision tree for the problem) $h$ classifies $r$ instances of $S$ incorrectly. The sample error is therefore $r/n$, and serves as an estimate of the true error $p$ of $h$. Plot the size of the 95% confidence interval as a function of $n$ (assume that $r/n$ is approximately $p$, where $p$ is unknown but constant).

Solution: Let $error_S(h)$ be the sample error $r/n$ and $error_D(h)$ be the true error $p$. Our objective here is to determine the size $\varepsilon$ such that $Pr(|error_S(h) - error_D(h)| < \varepsilon) = 0.95$. We assume $r/n \approx p$, where $p$ is unknown but constant. We assume the sample $S$ contains $n$ examples drawn independently of each other and of $h$. Then we define a random variable $t$, where $t = 1$ on failure and $t = 0$ on success. So $Pr(t = 1) = p$ and $Pr(t = 0) = 1 - p$. The mean of $t$ is $p$ and the variance of $t$ is $p(1 - p)$. Now we define $n$ variables $t_1, t_2, \ldots, t_n$, one for each element of $S$, and let $r = t_1 + t_2 + \cdots + t_n$. So the mean of $r$ is $np$ and the variance of $r$ is $np(1 - p)$. We treat $error_S(h) = r/n$ as a random variable, so it has mean $\mu = p$ and variance $\sigma^2 = p(1 - p)/n$. The probability density function is given by

$$pdf(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}} \qquad (5)$$

In particular, we have $\varepsilon = 1.96\sigma$ for a normal distribution, such that $Pr(-1.96\sigma \le x - \mu \le 1.96\sigma) = 0.95$. So based on equation (5), the size of the 95% confidence interval is

$$CI = 3.92 \sqrt{\frac{p(1 - p)}{n}}.$$

If we plot CI as a function of $n$, we have $CI = \left(3.92 \sqrt{p(1 - p)}\right) n^{-1/2}$. An approximation of this curve is shown in Figure 1.

Figure 1: Confidence interval as a function of n

b. In a supervised learning problem, we want to compare the goodness of two different hypotheses $h_1$ and $h_2$. Two independent test sets $S_1$ and $S_2$ of sizes $n_1$ and $n_2$ are used for $h_1$ and $h_2$ respectively. $h_1$ and $h_2$ make $r_1$ and $r_2$ errors of classification over $S_1$ and $S_2$ respectively. The true errors for $h_1$ and $h_2$ are $p_1$ and $p_2$ respectively. For $n_1 = 50$, $r_1 = 4$, $n_2 = 30$, $r_2 = 2$: find an estimate of $p_1 - p_2$, give the 95% confidence interval for your estimate, and find the probability that $p_1 > p_2$.

Solution: Let $error_D(h_1)$ and $error_D(h_2)$ be the true errors of $h_1$ and $h_2$, respectively, and let $error_{S_1}(h_1)$ and $error_{S_2}(h_2)$ be the sample errors over $S_1$ and $S_2$, respectively. Also let $d = error_D(h_1) - error_D(h_2)$, i.e., $d = p_1 - p_2$.

We assume $error_{S_1}(h_1)$ and $error_{S_2}(h_2)$ are estimates of $p_1$ and $p_2$, respectively, i.e., $p_1 \approx r_1/n_1 = 2/25$ and $p_2 \approx r_2/n_2 = 1/15$. Then $\hat{d} = error_{S_1}(h_1) - error_{S_2}(h_2) = 1/75$ is an estimate of $p_1 - p_2$. We assume $\hat{d}$ is normally distributed. Its mean is $\mu = p_1 - p_2$ and its variance is

$$\sigma^2 = \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2} \approx 0.003546.$$

Hence $\sigma \approx 0.0596$. In particular, we have $\varepsilon = 1.96\sigma$ for a normal distribution, such that $Pr(-1.96\sigma \le x - \mu \le 1.96\sigma) = 0.95$. So based on equation (5), the 95% confidence interval is $\hat{d} \pm \varepsilon = 1/75 \pm 1.96\sigma \approx 0.0133 \pm 0.1167$.

Next we have $Pr(p_1 > p_2) = Pr(p_1 - p_2 > 0) = Pr(d > 0)$. So we have

$$Pr(d > 0) = \int_{0}^{+\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}\, dx \qquad (6)$$

Let $v = (x - \mu)/\sigma$; then equation (6) is equivalent to

$$Pr(d > 0) = \int_{a}^{b} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{v^2}{2}}\, dv \qquad (7)$$

where $a = (0 - \mu)/\sigma = -0.2239$ and $b = +\infty$. So $Pr(p_1 > p_2) = 1 - \Phi(a) = 1 - 0.4114 = 0.5886$.
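The numbers in part b can be reproduced with a few lines of Python. This is a sketch under the same normal-approximation assumptions as the solution, not part of the original hand-in; the function name `compare_hypotheses` is illustrative, and the sample error rates stand in for the unknown $p_1$ and $p_2$ when estimating $\sigma$.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def compare_hypotheses(r1, n1, r2, n2, z=1.96):
    """Estimate d = p1 - p2, the confidence half-width z*sigma, and Pr(p1 > p2)."""
    p1, p2 = r1 / n1, r2 / n2          # sample errors used as estimates of p1, p2
    d_hat = p1 - p2
    sigma = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half_width = z * sigma             # epsilon = 1.96 * sigma for 95% confidence
    a = (0.0 - d_hat) / sigma          # lower limit of equation (7)
    prob_p1_greater = 1.0 - phi(a)
    return d_hat, half_width, prob_p1_greater

d_hat, eps, prob = compare_hypotheses(r1=4, n1=50, r2=2, n2=30)
print(f"d = {d_hat:.4f}, 95% CI = +/- {eps:.4f}, Pr(p1 > p2) = {prob:.4f}")
```

With such small test sets the interval is far wider than the estimated difference itself, which is why the probability that $p_1 > p_2$ stays close to one half.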

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

10-701/ Machine Learning Mid-term Exam Solution

10-701/ Machine Learning Mid-term Exam Solution 0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

This is an introductory course in Analysis of Variance and Design of Experiments.

This is an introductory course in Analysis of Variance and Design of Experiments. 1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class

More information

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency Math 152. Rumbos Fall 2009 1 Solutios to Review Problems for Exam #2 1. I the book Experimetatio ad Measuremet, by W. J. Youde ad published by the by the Natioal Sciece Teachers Associatio i 1962, the

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9 Hypothesis testig PSYCHOLOGICAL RESEARCH (PYC 34-C Lecture 9 Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I

More information

Class 23. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

Class 23. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700 Class 23 Daiel B. Rowe, Ph.D. Departmet of Mathematics, Statistics, ad Computer Sciece Copyright 2017 by D.B. Rowe 1 Ageda: Recap Chapter 9.1 Lecture Chapter 9.2 Review Exam 6 Problem Solvig Sessio. 2

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics 8.2 Testig a Proportio Math 1 Itroductory Statistics Professor B. Abrego Lecture 15 Sectios 8.2 People ofte make decisios with data by comparig the results from a sample to some predetermied stadard. These

More information

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to: STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Statistics 20: Final Exam Solutions Summer Session 2007

Statistics 20: Final Exam Solutions Summer Session 2007 1. 20 poits Testig for Diabetes. Statistics 20: Fial Exam Solutios Summer Sessio 2007 (a) 3 poits Give estimates for the sesitivity of Test I ad of Test II. Solutio: 156 patiets out of total 223 patiets

More information

NCSS Statistical Software. Tolerance Intervals

NCSS Statistical Software. Tolerance Intervals Chapter 585 Itroductio This procedure calculates oe-, ad two-, sided tolerace itervals based o either a distributio-free (oparametric) method or a method based o a ormality assumptio (parametric). A two-sided

More information

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more

More information

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y. Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed

More information

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),

More information

Table 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab

Table 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab Sectio 12 Tests of idepedece ad homogeeity I this lecture we will cosider a situatio whe our observatios are classified by two differet features ad we would like to test if these features are idepedet

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo

More information

Chapter 8: Estimating with Confidence

Chapter 8: Estimating with Confidence Chapter 8: Estimatig with Cofidece Sectio 8.2 The Practice of Statistics, 4 th editio For AP* STARNES, YATES, MOORE Chapter 8 Estimatig with Cofidece 8.1 Cofidece Itervals: The Basics 8.2 8.3 Estimatig

More information

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M.

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M. MATH1005 Statistics Lecture 24 M. Stewart School of Mathematics ad Statistics Uiversity of Sydey Outlie Cofidece itervals summary Coservative ad approximate cofidece itervals for a biomial p The aïve iterval

More information

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract These otes are adapted from whe I taught Math 526 ad meat to give a quick itroductio to cofidece

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Topic 18: Composite Hypotheses

Topic 18: Composite Hypotheses Toc 18: November, 211 Simple hypotheses limit us to a decisio betwee oe of two possible states of ature. This limitatio does ot allow us, uder the procedures of hypothesis testig to address the basic questio:

More information

A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population

A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population A quick activity - Cetral Limit Theorem ad Proportios Lecture 21: Testig Proportios Statistics 10 Coli Rudel Flip a coi 30 times this is goig to get loud! Record the umber of heads you obtaied ad calculate

More information

Module 1 Fundamentals in statistics
