Solutions - Homework #1

Size: px
Start display at page:

Download "Solutions - Homework #1"

Transcription

1 Solutions - Homework #1 1. Problem 1: Below appears a summary of the paper The pattern of a host-parasite distribution by Schmid & Robinson (197). Using the gnat Culicoides crepuscularis as a host specimen and the filarial nematode Chandlerella quiscali as a parasite, the variation in the pattern of distribution of parasites in the host was studied to assess whether the pattern was random (Poisson) or not. Specifically, 143 gnats were examined from the same infected purple grackle and the number of nematodes counted on each gnat. Three analyses of these count data were performed using chi-square goodness of fit techniques to assess whether the data might have arisen from a Poisson distribution. Whether observing only the infected gnats or all gnats, the observed frequencies of nematodes did not appear to fit a Poisson model (p <.1), but fit a negative binomial model quite closely. This finding is supported by the high variance/mean ratio of 6.68 and the clumped distribution of nematodes per gnat. Schmid & Robinson list four plausible explanations for the clumped parasite pattern within this host and provide examples of other studies attributing this clumpiness to variation in nematode density. Their basic message is to emphasize the importance of pattern analysis in the study of parasitism.. Problem : We are given two sets of criteria for evaluating whether or not the HAART treatment was a success or a failure (virological - the gold standard, and clinical/immunological). The simplest way to visualize the information provided regarding the number of cases resulting in successes or failure from these two types of evaluation is to make a x table of counts, as given to the right. With 37 total patients, the counts of 14, 48, and 3 were first placed in this table, and the remaining values filled in according to the required row and column totals. With this table then, we recognize that A, B, C, and D as defined in Box.4 of the Ecology of Wildlife Diseases book are given is labeled in the table. Computing then, we have: sensitivity = specificity = observed prevalence = true prevalence = Clinical/Immunological HAART HAART Failure Success Total HAART Failure Virological A C HAART Success B Total true-failure all virological failures = A A+C = = 3 14 =.143, true-success all virological successes = D B +D = =.856, all clin/imm. failures = A+B number examined obs.prev. + specif. - 1 sens.+spec. 1 = = D N = =.1468, (A+B)/N +D/(B +D) 1 A/(A+C) B/(B +D) =.3.75 =.48. The small sensitivity value indicates that the clinical/immunological tests do a poor job indicating failure of the HAART therapy when in fact it has failed. On the other hand, the high specificity value indicates that when the treatment is a success, the clinical/immunological evaluations correctly identify it as a success about 86% of the time. The observed prevalence is just the proportion of 1

2 HAART therapy cases deemed to have failed according to the clinical/immunological evaluation; in this case, this evaluation found that the therapy failed 14.68% of the time. However, when accounting for the misclassifications made as measured by the sensitivity and specificity, the true prevalence of HAART failures was just 4.8%. This smaller value results from the large number of false positive results, where 45 patients were deemed to have HAART therapy failure, when in fact they did not. 3. Problem 3: From a randomized experiment involving 36 mice, data were collected on the percentages of Fe 3+ and Fe 4+ retained after a fixed time interval. The resulting percentages for the two groups of 18 mice each are displayed in the boxplot to the right. Iron retention percentages tend to be higher for the Fe 4+ group. The distribution of Fe 4+ retention percentages is somewhat skewed to the right with percentages ranging from.% to 11.65%, centered at a median of 5.75% with one mild outlier at 1.45%. The Fe 3+ distribution of percentages is fairly symmetric with values ranging from.71% to 5.6%, centered at a median of 3.48% with two outliers at 8.15% & 8.4%. 4. Problem 4 Percentage Iron Retained Fe 3+ Fe 4+ Iron Group (a) The MatLab code to conduct this simulation is given at the end of this solutions handout. (b) The five histograms of the variance-to-mean ratios for randomly generated samples of sizes n = 1,5,5,, and 5 from a Poisson (θ = ) distribution are shown below. 5 Poisson Ratios (n=1) 3 Poisson Ratios (n=5) Poisson Ratios (n=5) Poisson Ratios (n=) Poisson Ratios (n=5) In viewing these histograms, the distributions of variance-to-mean ratios for all sample sizes are centered at a ratio of 1 and are roughly symmetric, with some evidence of slight right-skewness, especially for the smaller sample sizes. The primary difference between these distributions is the

3 variability in the ratios. For a sample size of 1, ratios could be as small as. and as large as.5, whereas all ratios are within about. of 1 for n = 5. It is clear that deciding whether or not a variance-to-mean ratio differs from 1 (indicating a departure from randomness) depends critically on the sample size. (c) The table to the right summarizes the ratios for the five cases. As with the histograms, the confidence intervals in this table clearly indicate the decreasing variability in the variance-to-mean ratio as a function of sample size. This can also be easily observed in the ratio standard deviations. Although there was some skewness evident in the Sample Mean of SD of Confidence Size n Ratios Ratios Interval (.8,.6) (.5, 1.7) (.63, 1.46) (.76, 1.3) (.88, 1.13) ratio distributions, there does not appear to be any systematic bias, as the ratio means are all very close to the Poisson ratio of 1. (d) If someone were to come to me with count data having a mean of and a variance-to-mean ratio of 1.5 wondering whether or nor they were statistically aggregated, I would tell her that it depends on the sample size of her data. Based on this small-scale simulation study, a variance-to-mean ratio of 1.5 would not be unusual for sample sizes of 1 or 5 (since the 95% confidence intervals for the ratio contain 1.5); however, for samples of size 5 or greater, a ratio of 1.5 is unusual given the random distribution of ratios from our simulation and we might be more inclined to conclude that the data are aggregated. Given the difficulty finding a confidence interval for a variance-to-mean ratio analytically, a simulation such as this is useful for studying the properties of this ratio. 5. Problem 5 (a) Taking the hint, the log-likelihood function for θ is given by: n [ θ x i e θ ] n n n log L(θ x) = log = log θ +log e θ log 1/x i! x i! = logθ +loge nθ K (where K is a constant with respect to θ) = x i logθ nθ K. Differentiating the log-likelihood with respect to θ and setting it equal to : logl(θ x) θ = θ= θ θ n = θ = n = x. logl(θ x) Taking the nd derivative of logl(θ x) with respect to θ: θ = <, so θ= θ θ that θ is a local mamum. Since θ was uniquely determined, then θ is the absolute mamum and hence the MLE of θ. (b) A histogram of the number of tapeworms per perch is shown to the right at the top of the next page (MatLab code given at the end of the solutions). In viewing these data, the distribution of the number oftapewormsper perchis highly right-skewedwith morethan 75%ofthe gnatshaving 1 tapeworm orfewer. The median number oftapeworms is 1 and the number oftapewormsranged from to 6 per perch. These data have a variance-to-mean ratio of s /m = 1.69/.888= 1., so based on the simulation in Problem 4, these data appear more aggregated than would be expected from a Poisson distribution. 3

4 (c) The MLE of θ for these data is computed as: θ = n = 168+(75)+3(3)+4(7)+5()+6(1) 5 = =.888. To find the expected frequency for X = x tapeworms per perch for x =,1,..., we multiply the total number of tapeworms (n = 5) by: P(X = x) = θ x e θ x! Doing so gives the frequency table to the right. The calculations in this table were made using MatLab as shown at the end of the solutions. (d) Computing the chi-square test statistic using the cells in the frequency table of part (e): D = =.888x e.888. x! Histogram of Tapeworm Counts # Tapeworms per perch # tapeworms Observed Expected per perch P(X = x) (O i E i ) [ ] ( ) = + + (1 6.5) E i = = 9.8. With g = 5 groups, there are g = 3 degrees of freedom for this test. The p-value for this test (the likelihood of getting a value of 9.8 or greater from a chi-square distribution with 3 d.f.) is computed in MatLab as: 1-chicdf(D,3) =.58. This p-value indicates moderate evidence of lack of fit to a Poisson model. MatLab Code Used for Homework #1 % ======================== % % Problem 3: Iron Data EDA % % ======================== % load irondiet.mat boxplot(irondiet.fe,irondiet.type, labels,{ Fe 3+, Fe 4+ }) xlabel( Iron Group, fontsize,14); ylabel( Percentage Iron Retained, fontsize,14) median(irondiet.fe(irondiet.type==3)) median(irondiet.fe(irondiet.type==4)) % ============================= % % Problem 4: Poisson simulation % % ============================= % theta = ; % Assigns the Poisson mean at n = [ ]; % Vector of 5 sample sizes for i = 1:5 % Begins loop through n-values nsim = ; % Sets the number of simulations 4

5 for j = 1:nsim % Loops through simulations dat = poissrnd(theta,n(i),1); % Generates n Poisson() values ratio(j) = var(dat)/mean(dat); % Computes var-to-mean ratio end % End simulation loop mrat(i) = mean(ratio); % Computes mean of ratios sdrat(i) = std(ratio); % Computes SD of ratios ratio = sort(ratio); % Sorts the ratios ci.low(i) = ratio(round(nsim*.5)); % Ratio CI lower limit ci.upp(i) = ratio(round(nsim*.975)); % Ratio CI upper limit subplot(3,,i); % ith plot in 3x window hist(ratio) % Histogram of the ratios xlabel( Variance-to-Mean Ratios ) % x-as label on plot ylabel( ) % y-as label on plot title([ Poisson Ratios (n=,numstr(n(i)), ) ]) xlim([.5]); % Sets x-as limits end % End of sample size loop % ======================================= % % Problem 5: Histogram of Tapeworm Counts % % ======================================= % tapeworm = [zeros(1,35) ones(1,168)... % Vector of tapeworm counts *ones(1,75) 3*ones(1,3)... % created efficiently using 4*ones(1,7) 5 5 6]; % the "ones" function breaks = :6; % Centers for histogram bars hist(tapeworm,breaks) % Histogram of tapeworm counts xlabel( # Tapeworms per perch ) ylabel( ) title( Histogram of Tapeworm Counts ) % ======================================== % % Problem 5: MLE & Chi-Square Computations % % ======================================== % mle = mean(tapeworm); % Mean of tapeworm counts px = poisspdf(:3,mle); % Poisson probs for -3 px = [px,(1-sum(px))]; % Adds prob. for >= 4 expfreq = px*length(tapeworm); % Computes expected frequencies breaks = :4; % Centers for histogram counts [obsfreq,mid] = hist(tapeworm,breaks); % Histogram counts in "obsfreq" D = sum((obsfreq-expfreq).^./expfreq); % Computes chi-squared statistic pval = 1 - chicdf(d,3); % Computes chi-squared p-value 5

Solutions - Homework #2

Solutions - Homework #2 45 Scatterplot of Abundance vs. Relative Density Parasite Abundance 4 35 3 5 5 5 5 5 Relative Host Population Density Figure : 3 Scatterplot of Log Abundance vs. Log RD Log Parasite Abundance 3.5.5.5.5.5

More information

Solutions - Homework #2

Solutions - Homework #2 Solutions - Homework #2 1. Problem 1: Biological Recovery (a) A scatterplot of the biological recovery percentages versus time is given to the right. In viewing this plot, there is negative, slightly nonlinear

More information

Stat 427/527: Advanced Data Analysis I

Stat 427/527: Advanced Data Analysis I Stat 427/527: Advanced Data Analysis I Review of Chapters 1-4 Sep, 2017 1 / 18 Concepts you need to know/interpret Numerical summaries: measures of center (mean, median, mode) measures of spread (sample

More information

EXAMINERS REPORT & SOLUTIONS STATISTICS 1 (MATH 11400) May-June 2009

EXAMINERS REPORT & SOLUTIONS STATISTICS 1 (MATH 11400) May-June 2009 EAMINERS REPORT & SOLUTIONS STATISTICS (MATH 400) May-June 2009 Examiners Report A. Most plots were well done. Some candidates muddled hinges and quartiles and gave the wrong one. Generally candidates

More information

Ch18 links / ch18 pdf links Ch18 image t-dist table

Ch18 links / ch18 pdf links Ch18 image t-dist table Ch18 links / ch18 pdf links Ch18 image t-dist table ch18 (inference about population mean) exercises: 18.3, 18.5, 18.7, 18.9, 18.15, 18.17, 18.19, 18.27 CHAPTER 18: Inference about a Population Mean The

More information

Dover- Sherborn High School Mathematics Curriculum Probability and Statistics

Dover- Sherborn High School Mathematics Curriculum Probability and Statistics Mathematics Curriculum A. DESCRIPTION This is a full year courses designed to introduce students to the basic elements of statistics and probability. Emphasis is placed on understanding terminology and

More information

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Fundamentals to Biostatistics Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Statistics collection, analysis, interpretation of data development of new

More information

REVIEW: Midterm Exam. Spring 2012

REVIEW: Midterm Exam. Spring 2012 REVIEW: Midterm Exam Spring 2012 Introduction Important Definitions: - Data - Statistics - A Population - A census - A sample Types of Data Parameter (Describing a characteristic of the Population) Statistic

More information

Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t

Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t t Confidence Interval for Population Mean Comparing z and t Confidence Intervals When neither z nor t Applies

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

Bayesian Analysis - A First Example

Bayesian Analysis - A First Example Bayesian Analysis - A First Example This script works through the example in Hoff (29), section 1.2.1 We are interested in a single parameter: θ, the fraction of individuals in a city population with with

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

Statistics 135 Fall 2007 Midterm Exam

Statistics 135 Fall 2007 Midterm Exam Name: Student ID Number: Statistics 135 Fall 007 Midterm Exam Ignore the finite population correction in all relevant problems. The exam is closed book, but some possibly useful facts about probability

More information

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018 Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018 Sampling A trait is measured on each member of a population. f(y) = propn of individuals in the popn with measurement

More information

Some Assorted Formulae. Some confidence intervals: σ n. x ± z α/2. x ± t n 1;α/2 n. ˆp(1 ˆp) ˆp ± z α/2 n. χ 2 n 1;1 α/2. n 1;α/2

Some Assorted Formulae. Some confidence intervals: σ n. x ± z α/2. x ± t n 1;α/2 n. ˆp(1 ˆp) ˆp ± z α/2 n. χ 2 n 1;1 α/2. n 1;α/2 STA 248 H1S MIDTERM TEST February 26, 2008 SURNAME: SOLUTIONS GIVEN NAME: STUDENT NUMBER: INSTRUCTIONS: Time: 1 hour and 50 minutes Aids allowed: calculator Tables of the standard normal, t and chi-square

More information

Interpret Standard Deviation. Outlier Rule. Describe the Distribution OR Compare the Distributions. Linear Transformations SOCS. Interpret a z score

Interpret Standard Deviation. Outlier Rule. Describe the Distribution OR Compare the Distributions. Linear Transformations SOCS. Interpret a z score Interpret Standard Deviation Outlier Rule Linear Transformations Describe the Distribution OR Compare the Distributions SOCS Using Normalcdf and Invnorm (Calculator Tips) Interpret a z score What is an

More information

CH.8 Statistical Intervals for a Single Sample

CH.8 Statistical Intervals for a Single Sample CH.8 Statistical Intervals for a Single Sample Introduction Confidence interval on the mean of a normal distribution, variance known Confidence interval on the mean of a normal distribution, variance unknown

More information

Introduction to hypothesis testing

Introduction to hypothesis testing Introduction to hypothesis testing Review: Logic of Hypothesis Tests Usually, we test (attempt to falsify) a null hypothesis (H 0 ): includes all possibilities except prediction in hypothesis (H A ) If

More information

Math 361. Day 3 Traffic Fatalities Inv. A Random Babies Inv. B

Math 361. Day 3 Traffic Fatalities Inv. A Random Babies Inv. B Math 361 Day 3 Traffic Fatalities Inv. A Random Babies Inv. B Last Time Did traffic fatalities decrease after the Federal Speed Limit Law? we found the percent change in fatalities dropped by 17.14% after

More information

Dr. Maddah ENMG 617 EM Statistics 10/15/12. Nonparametric Statistics (2) (Goodness of fit tests)

Dr. Maddah ENMG 617 EM Statistics 10/15/12. Nonparametric Statistics (2) (Goodness of fit tests) Dr. Maddah ENMG 617 EM Statistics 10/15/12 Nonparametric Statistics (2) (Goodness of fit tests) Introduction Probability models used in decision making (Operations Research) and other fields require fitting

More information

3. DISCRETE PROBABILITY DISTRIBUTIONS

3. DISCRETE PROBABILITY DISTRIBUTIONS 1 3. DISCRETE PROBABILITY DISTRIBUTIONS Probability distributions may be discrete or continuous. This week we examine two discrete distributions commonly used in biology: the binomial and Poisson distributions.

More information

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Random Variables (RVs) are discrete quantitative continuous nominal qualitative ordinal Notation and Definitions: a Sample is

More information

IV. The Normal Distribution

IV. The Normal Distribution IV. The Normal Distribution The normal distribution (a.k.a., a the Gaussian distribution or bell curve ) is the by far the best known random distribution. It s discovery has had such a far-reaching impact

More information

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides Chapter 7 Inference for Distributions Introduction to the Practice of STATISTICS SEVENTH EDITION Moore / McCabe / Craig Lecture Presentation Slides Chapter 7 Inference for Distributions 7.1 Inference for

More information

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments /4/008 Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University C. A Sample of Data C. An Econometric Model C.3 Estimating the Mean of a Population C.4 Estimating the Population

More information

STAT 200 Chapter 1 Looking at Data - Distributions

STAT 200 Chapter 1 Looking at Data - Distributions STAT 200 Chapter 1 Looking at Data - Distributions What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Interval estimation. October 3, Basic ideas CLT and CI CI for a population mean CI for a population proportion CI for a Normal mean

Interval estimation. October 3, Basic ideas CLT and CI CI for a population mean CI for a population proportion CI for a Normal mean Interval estimation October 3, 2018 STAT 151 Class 7 Slide 1 Pandemic data Treatment outcome, X, from n = 100 patients in a pandemic: 1 = recovered and 0 = not recovered 1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0

More information

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods Permutation Tests Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods The Two-Sample Problem We observe two independent random samples: F z = z 1, z 2,, z n independently of

More information

Chapter 23: Inferences About Means

Chapter 23: Inferences About Means Chapter 3: Inferences About Means Sample of Means: number of observations in one sample the population mean (theoretical mean) sample mean (observed mean) is the theoretical standard deviation of the population

More information

Unit 27 One-Way Analysis of Variance

Unit 27 One-Way Analysis of Variance Unit 27 One-Way Analysis of Variance Objectives: To perform the hypothesis test in a one-way analysis of variance for comparing more than two population means Recall that a two sample t test is applied

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

n y π y (1 π) n y +ylogπ +(n y)log(1 π).

n y π y (1 π) n y +ylogπ +(n y)log(1 π). Tests for a binomial probability π Let Y bin(n,π). The likelihood is L(π) = n y π y (1 π) n y and the log-likelihood is L(π) = log n y +ylogπ +(n y)log(1 π). So L (π) = y π n y 1 π. 1 Solving for π gives

More information

Created by T. Madas. Candidates may use any calculator allowed by the regulations of this examination.

Created by T. Madas. Candidates may use any calculator allowed by the regulations of this examination. IYGB GCE Mathematics MMS Advanced Level Practice Paper Q Difficulty Rating: 3.400/0.6993 Time: 3 hours Candidates may use any calculator allowed by the regulations of this examination. Information for

More information

Sampling Distributions: Central Limit Theorem

Sampling Distributions: Central Limit Theorem Review for Exam 2 Sampling Distributions: Central Limit Theorem Conceptually, we can break up the theorem into three parts: 1. The mean (µ M ) of a population of sample means (M) is equal to the mean (µ)

More information

Chapter 23. Inference About Means

Chapter 23. Inference About Means Chapter 23 Inference About Means 1 /57 Homework p554 2, 4, 9, 10, 13, 15, 17, 33, 34 2 /57 Objective Students test null and alternate hypotheses about a population mean. 3 /57 Here We Go Again Now that

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Q: What is data? Q: What does the data look like? Q: What conclusions can we draw from the data? Q: Where is the middle of the data? Q: Why is the spread of the data important? Q:

More information

Pump failure data. Pump Failures Time

Pump failure data. Pump Failures Time Outline 1. Poisson distribution 2. Tests of hypothesis for a single Poisson mean 3. Comparing multiple Poisson means 4. Likelihood equivalence with exponential model Pump failure data Pump 1 2 3 4 5 Failures

More information

STAT 536: Genetic Statistics

STAT 536: Genetic Statistics STAT 536: Genetic Statistics Tests for Hardy Weinberg Equilibrium Karin S. Dorman Department of Statistics Iowa State University September 7, 2006 Statistical Hypothesis Testing Identify a hypothesis,

More information

Final Exam. Name: Solution:

Final Exam. Name: Solution: Final Exam. Name: Instructions. Answer all questions on the exam. Open books, open notes, but no electronic devices. The first 13 problems are worth 5 points each. The rest are worth 1 point each. HW1.

More information

Exam details. Final Review Session. Things to Review

Exam details. Final Review Session. Things to Review Exam details Final Review Session Short answer, similar to book problems Formulae and tables will be given You CAN use a calculator Date and Time: Dec. 7, 006, 1-1:30 pm Location: Osborne Centre, Unit

More information

MAT Mathematics in Today's World

MAT Mathematics in Today's World MAT 1000 Mathematics in Today's World Last Time 1. Three keys to summarize a collection of data: shape, center, spread. 2. Can measure spread with the fivenumber summary. 3. The five-number summary can

More information

BIOS 625 Fall 2015 Homework Set 3 Solutions

BIOS 625 Fall 2015 Homework Set 3 Solutions BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices.

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. 1. What is the difference between a deterministic model and a probabilistic model? (Two or three sentences only). 2. What is the

More information

Lecture 10: Generalized likelihood ratio test

Lecture 10: Generalized likelihood ratio test Stat 200: Introduction to Statistical Inference Autumn 2018/19 Lecture 10: Generalized likelihood ratio test Lecturer: Art B. Owen October 25 Disclaimer: These notes have not been subjected to the usual

More information

Homework Example Chapter 1 Similar to Problem #14

Homework Example Chapter 1 Similar to Problem #14 Chapter 1 Similar to Problem #14 Given a sample of n = 129 observations of shower-flow-rate, do this: a.) Construct a stem-and-leaf display of the data. b.) What is a typical, or representative flow rate?

More information

Week 4 Concept of Probability

Week 4 Concept of Probability Week 4 Concept of Probability Mudrik Alaydrus Faculty of Computer Sciences University of Mercu Buana, Jakarta mudrikalaydrus@yahoo.com 1 Introduction : A Speech hrecognition i System computer communication

More information

Robustness and Distribution Assumptions

Robustness and Distribution Assumptions Chapter 1 Robustness and Distribution Assumptions 1.1 Introduction In statistics, one often works with model assumptions, i.e., one assumes that data follow a certain model. Then one makes use of methodology

More information

This does not cover everything on the final. Look at the posted practice problems for other topics.

This does not cover everything on the final. Look at the posted practice problems for other topics. Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry

More information

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV Theory of Engineering Experimentation Chapter IV. Decision Making for a Single Sample Chapter IV 1 4 1 Statistical Inference The field of statistical inference consists of those methods used to make decisions

More information

CS 5014: Research Methods in Computer Science. Bernoulli Distribution. Binomial Distribution. Poisson Distribution. Clifford A. Shaffer.

CS 5014: Research Methods in Computer Science. Bernoulli Distribution. Binomial Distribution. Poisson Distribution. Clifford A. Shaffer. Department of Computer Science Virginia Tech Blacksburg, Virginia Copyright c 2015 by Clifford A. Shaffer Computer Science Title page Computer Science Clifford A. Shaffer Fall 2015 Clifford A. Shaffer

More information

STATISTICS 141 Final Review

STATISTICS 141 Final Review STATISTICS 141 Final Review Bin Zou bzou@ualberta.ca Department of Mathematical & Statistical Sciences University of Alberta Winter 2015 Bin Zou (bzou@ualberta.ca) STAT 141 Final Review Winter 2015 1 /

More information

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling Review for Final For a detailed review of Chapters 1 7, please see the review sheets for exam 1 and. The following only briefly covers these sections. The final exam could contain problems that are included

More information

Topic 8. Data Transformations [ST&D section 9.16]

Topic 8. Data Transformations [ST&D section 9.16] Topic 8. Data Transformations [ST&D section 9.16] 8.1 The assumptions of ANOVA For ANOVA, the linear model for the RCBD is: Y ij = µ + τ i + β j + ε ij There are four key assumptions implicit in this model.

More information

Introduction 1. STA442/2101 Fall See last slide for copyright information. 1 / 33

Introduction 1. STA442/2101 Fall See last slide for copyright information. 1 / 33 Introduction 1 STA442/2101 Fall 2016 1 See last slide for copyright information. 1 / 33 Background Reading Optional Chapter 1 of Linear models with R Chapter 1 of Davison s Statistical models: Data, and

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation

More information

Beyond GLM and likelihood

Beyond GLM and likelihood Stat 6620: Applied Linear Models Department of Statistics Western Michigan University Statistics curriculum Core knowledge (modeling and estimation) Math stat 1 (probability, distributions, convergence

More information

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

Chapter 6 The Standard Deviation as a Ruler and the Normal Model Chapter 6 The Standard Deviation as a Ruler and the Normal Model Overview Key Concepts Understand how adding (subtracting) a constant or multiplying (dividing) by a constant changes the center and/or spread

More information

Additional Problems Additional Problem 1 Like the http://www.stat.umn.edu/geyer/5102/examp/rlike.html#lmax example of maximum likelihood done by computer except instead of the gamma shape model, we will

More information

Confidence Intervals. Confidence interval for sample mean. Confidence interval for sample mean. Confidence interval for sample mean

Confidence Intervals. Confidence interval for sample mean. Confidence interval for sample mean. Confidence interval for sample mean Confidence Intervals Confidence interval for sample mean The CLT tells us: as the sample size n increases, the sample mean is approximately Normal with mean and standard deviation Thus, we have a standard

More information

Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing

Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing So, What is Statistics? Theory and techniques for learning from data How to collect How to analyze How to interpret

More information

STAT2201 Assignment 3 Semester 1, 2017 Due 13/4/2017

STAT2201 Assignment 3 Semester 1, 2017 Due 13/4/2017 Class Example 1. Single Sample Descriptive Statistics (a) Summary Statistics and Box-Plots You are working in factory producing hand held bicycle pumps and obtain a sample of 174 bicycle pump weights in

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected What is statistics? Statistics is the science of: Collecting information Organizing and summarizing the information collected Analyzing the information collected in order to draw conclusions Two types

More information

IV. The Normal Distribution

IV. The Normal Distribution IV. The Normal Distribution The normal distribution (a.k.a., the Gaussian distribution or bell curve ) is the by far the best known random distribution. It s discovery has had such a far-reaching impact

More information

Statistical Estimation

Statistical Estimation Statistical Estimation Use data and a model. The plug-in estimators are based on the simple principle of applying the defining functional to the ECDF. Other methods of estimation: minimize residuals from

More information

Part 7: Glossary Overview

Part 7: Glossary Overview Part 7: Glossary Overview In this Part This Part covers the following topic Topic See Page 7-1-1 Introduction This section provides an alphabetical list of all the terms used in a STEPS surveillance with

More information

Probability Distributions for Continuous Variables. Probability Distributions for Continuous Variables

Probability Distributions for Continuous Variables. Probability Distributions for Continuous Variables Probability Distributions for Continuous Variables Probability Distributions for Continuous Variables Let X = lake depth at a randomly chosen point on lake surface If we draw the histogram so that the

More information

A C E. Answers Investigation 4. Applications

A C E. Answers Investigation 4. Applications Answers Applications 1. 1 student 2. You can use the histogram with 5-minute intervals to determine the number of students that spend at least 15 minutes traveling to school. To find the number of students,

More information

F79SM STATISTICAL METHODS

F79SM STATISTICAL METHODS F79SM STATISTICAL METHODS SUMMARY NOTES 9 Hypothesis testing 9.1 Introduction As before we have a random sample x of size n of a population r.v. X with pdf/pf f(x;θ). The distribution we assign to X is

More information

STAT 6350 Analysis of Lifetime Data. Probability Plotting

STAT 6350 Analysis of Lifetime Data. Probability Plotting STAT 6350 Analysis of Lifetime Data Probability Plotting Purpose of Probability Plots Probability plots are an important tool for analyzing data and have been particular popular in the analysis of life

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

Discrete Multivariate Statistics

Discrete Multivariate Statistics Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

Introduction to Statistical Data Analysis Lecture 4: Sampling

Introduction to Statistical Data Analysis Lecture 4: Sampling Introduction to Statistical Data Analysis Lecture 4: Sampling James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis 1 / 30 Introduction

More information

Basics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I.

Basics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I. Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/46 Overview 1 Basics on t-tests 2 Independent Sample t-tests 3 Single-Sample

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

Counting principles, including permutations and combinations.

Counting principles, including permutations and combinations. 1 Counting principles, including permutations and combinations. The binomial theorem: expansion of a + b n, n ε N. THE PRODUCT RULE If there are m different ways of performing an operation and for each

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Learning Objectives for Stat 225

Learning Objectives for Stat 225 Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:

More information

Percentage point z /2

Percentage point z /2 Chapter 8: Statistical Intervals Why? point estimate is not reliable under resampling. Interval Estimates: Bounds that represent an interval of plausible values for a parameter There are three types of

More information

Class 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

Class 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700 Class 4 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 013 by D.B. Rowe 1 Agenda: Recap Chapter 9. and 9.3 Lecture Chapter 10.1-10.3 Review Exam 6 Problem Solving

More information

Statistical methods. Mean value and standard deviations Standard statistical distributions Linear systems Matrix algebra

Statistical methods. Mean value and standard deviations Standard statistical distributions Linear systems Matrix algebra Statistical methods Mean value and standard deviations Standard statistical distributions Linear systems Matrix algebra Statistical methods Generating random numbers MATLAB has many built-in functions

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

Bootstrapping, Permutations, and Monte Carlo Testing

Bootstrapping, Permutations, and Monte Carlo Testing Bootstrapping, Permutations, and Monte Carlo Testing Problem: Population of interest is extremely rare spatially and you are interested in using a 95% CI to estimate total abundance. The sampling design

More information

Data Analysis and Uncertainty Part 2: Estimation

Data Analysis and Uncertainty Part 2: Estimation Data Analysis and Uncertainty Part 2: Estimation Instructor: Sargur N. University at Buffalo The State University of New York srihari@cedar.buffalo.edu 1 Topics in Estimation 1. Estimation 2. Desirable

More information

Distribution Fitting (Censored Data)

Distribution Fitting (Censored Data) Distribution Fitting (Censored Data) Summary... 1 Data Input... 2 Analysis Summary... 3 Analysis Options... 4 Goodness-of-Fit Tests... 6 Frequency Histogram... 8 Comparison of Alternative Distributions...

More information

Statistics. Lecture 2 August 7, 2000 Frank Porter Caltech. The Fundamentals; Point Estimation. Maximum Likelihood, Least Squares and All That

Statistics. Lecture 2 August 7, 2000 Frank Porter Caltech. The Fundamentals; Point Estimation. Maximum Likelihood, Least Squares and All That Statistics Lecture 2 August 7, 2000 Frank Porter Caltech The plan for these lectures: The Fundamentals; Point Estimation Maximum Likelihood, Least Squares and All That What is a Confidence Interval? Interval

More information

13: Additional ANOVA Topics. Post hoc Comparisons

13: Additional ANOVA Topics. Post hoc Comparisons 13: Additional ANOVA Topics Post hoc Comparisons ANOVA Assumptions Assessing Group Variances When Distributional Assumptions are Severely Violated Post hoc Comparisons In the prior chapter we used ANOVA

More information

Using R in Undergraduate Probability and Mathematical Statistics Courses. Amy G. Froelich Department of Statistics Iowa State University

Using R in Undergraduate Probability and Mathematical Statistics Courses. Amy G. Froelich Department of Statistics Iowa State University Using R in Undergraduate Probability and Mathematical Statistics Courses Amy G. Froelich Department of Statistics Iowa State University Undergraduate Probability and Mathematical Statistics at Iowa State

More information

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative

More information

FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE

FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE Course Title: Probability and Statistics (MATH 80) Recommended Textbook(s): Number & Type of Questions: Probability and Statistics for Engineers

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain 0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher

More information

Machine Learning, Fall 2012 Homework 2

Machine Learning, Fall 2012 Homework 2 0-60 Machine Learning, Fall 202 Homework 2 Instructors: Tom Mitchell, Ziv Bar-Joseph TA in charge: Selen Uguroglu email: sugurogl@cs.cmu.edu SOLUTIONS Naive Bayes, 20 points Problem. Basic concepts, 0

More information

Independent Samples ANOVA

Independent Samples ANOVA Independent Samples ANOVA In this example students were randomly assigned to one of three mnemonics (techniques for improving memory) rehearsal (the control group; simply repeat the words), visual imagery

More information

Contents 1. Contents

Contents 1. Contents Contents 1 Contents 1 One-Sample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 One-sample Z-test (see Chapter 0.3.1)...... 4 1.1.2 One-sample t-test................. 6 1.1.3 Large sample

More information

STA 101 Final Review

STA 101 Final Review STA 101 Final Review Statistics 101 Thomas Leininger June 24, 2013 Announcements All work (besides projects) should be returned to you and should be entered on Sakai. Office Hour: 2 3pm today (Old Chem

More information

Sampling Variability and Confidence Intervals. John McGready Johns Hopkins University

Sampling Variability and Confidence Intervals. John McGready Johns Hopkins University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information