Solutions - Homework #1
- Arnold Stewart
- 5 years ago
1. Problem 1: Below appears a summary of the paper "The pattern of a host-parasite distribution" by Schmid & Robinson (197). Using the gnat Culicoides crepuscularis as a host specimen and the filarial nematode Chandlerella quiscali as a parasite, the authors studied the distribution of parasites among hosts to assess whether the pattern was random (Poisson) or not. Specifically, 143 gnats were examined from the same infected purple grackle and the number of nematodes counted on each gnat. Three analyses of these count data were performed using chi-square goodness-of-fit techniques to assess whether the data might have arisen from a Poisson distribution. Whether observing only the infected gnats or all gnats, the observed frequencies of nematodes did not appear to fit a Poisson model (p <.1), but fit a negative binomial model quite closely. This finding is supported by the high variance/mean ratio of 6.68 and the clumped distribution of nematodes per gnat. Schmid & Robinson list four plausible explanations for the clumped parasite pattern within this host and provide examples of other studies attributing this clumpiness to variation in nematode density. Their basic message is to emphasize the importance of pattern analysis in the study of parasitism.

2. Problem 2: We are given two sets of criteria for evaluating whether the HAART treatment was a success or a failure: virological (the gold standard) and clinical/immunological. The simplest way to visualize the number of successes and failures under these two types of evaluation is to make a 2 x 2 table of counts, as given below. With 327 total patients, the counts of 14, 48, and 3 were first placed in this table, and the remaining values were filled in according to the required row and column totals.
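The bookkeeping for the 2 x 2 table can be sketched in MATLAB. This is a minimal sketch and not part of the original solution. It assumes a total of N = 327 patients (the total consistent with an observed prevalence of 48/N = .1468), 14 virological failures, 48 clinical/immunological failures, and 3 patients failing under both criteria. The variable names are illustrative, and the summary measures use the usual cell formulas: sensitivity A/(A+C), specificity D/(B+D), observed prevalence (A+B)/N, and true prevalence (obs. prev. + spec. - 1)/(sens. + spec. - 1).

```matlab
% Known counts (assumed as described in the lead-in)
N       = 327;   % total number of patients
virFail = 14;    % virological failures, A + C
cliFail = 48;    % clinical/immunological failures, A + B
A       = 3;     % failures under both criteria

% Fill in the remaining cells from the row and column totals
B = cliFail - A;       % clinical failure, virological success (45)
C = virFail - A;       % clinical success, virological failure (11)
D = N - A - B - C;     % success under both criteria (268)

% Summary measures from the cell counts
sens     = A/(A + C);                                 % 3/14
spec     = D/(B + D);                                 % 268/313
obsPrev  = (A + B)/N;                                 % 48/327
truePrev = (obsPrev + spec - 1)/(sens + spec - 1);    % reduces to 14/327

fprintf('B = %d, C = %d, D = %d\n', B, C, D);
fprintf('sensitivity = %.4f, specificity = %.4f\n', sens, spec);
fprintf('observed prevalence = %.4f, true prevalence = %.4f\n', obsPrev, truePrev);
```

Because the gold standard is fully observed here, the corrected prevalence simplifies algebraically to the proportion of virological failures, 14/327, which is a handy sanity check on the arithmetic.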
With this table, we recognize that A, B, C, and D as defined in Box.4 of the Ecology of Wildlife Diseases book are as labeled in the table:

                                   Clinical/Immunological
                             HAART Failure   HAART Success   Total
Virological  HAART Failure      A = 3           C = 11          14
             HAART Success      B = 45          D = 268        313
             Total                  48             279         327

Computing then, we have:

$$\text{sensitivity} = \frac{\text{true failures}}{\text{all virological failures}} = \frac{A}{A+C} = \frac{3}{14} = .2143,$$

$$\text{specificity} = \frac{\text{true successes}}{\text{all virological successes}} = \frac{D}{B+D} = \frac{268}{313} = .8562,$$

$$\text{observed prevalence} = \frac{\text{all clin/imm. failures}}{\text{number examined}} = \frac{A+B}{N} = \frac{48}{327} = .1468,$$

$$\text{true prevalence} = \frac{\text{obs. prev.} + \text{specif.} - 1}{\text{sens.} + \text{spec.} - 1} = \frac{(A+B)/N + D/(B+D) - 1}{A/(A+C) - B/(B+D)} = \frac{.0030}{.0705} = .0428.$$

The small sensitivity value indicates that the clinical/immunological tests do a poor job of indicating failure of the HAART therapy when in fact it has failed. On the other hand, the high specificity value indicates that when the treatment is a success, the clinical/immunological evaluations correctly identify it as a success about 86% of the time. The observed prevalence is just the proportion of HAART therapy cases deemed to have failed according to the clinical/immunological evaluation; in this case, this evaluation found that the therapy failed 14.68% of the time. However, when accounting for the misclassifications as measured by the sensitivity and specificity, the true prevalence of HAART failures was just 4.28%. This smaller value results from the large number of false positive results, where 45 patients were deemed to have HAART therapy failure when in fact they did not.

3. Problem 3: From a randomized experiment involving 36 mice, data were collected on the percentages of Fe 3+ and Fe 4+ retained after a fixed time interval. The resulting percentages for the two groups of 18 mice each are displayed in the boxplot to the right. Iron retention percentages tend to be higher for the Fe 4+ group. The distribution of Fe 4+ retention percentages is somewhat skewed to the right with percentages ranging from.% to 11.65%, centered at a median of 5.75% with one mild outlier at 1.45%. The Fe 3+ distribution of percentages is fairly symmetric with values ranging from.71% to 5.6%, centered at a median of 3.48% with two outliers at 8.15% & 8.4%.

[Figure: boxplots of Percentage Iron Retained by Iron Group (Fe 3+, Fe 4+)]

4. Problem 4

(a) The MatLab code to conduct this simulation is given at the end of this solutions handout.

(b) The five histograms of the variance-to-mean ratios for randomly generated samples of sizes n = 1,5,5,, and 5 from a Poisson (θ = ) distribution are shown below.

[Figure: five histograms of variance-to-mean ratios, one per sample size, each titled "Poisson Ratios (n=...)"]

In viewing these histograms, the distributions of variance-to-mean ratios for all sample sizes are centered at a ratio of 1 and are roughly symmetric, with some evidence of slight right-skewness, especially for the smaller sample sizes. The primary difference between these distributions is the variability in the ratios. For a sample size of 1, ratios could be as small as. and as large as.5, whereas all ratios are within about. of 1 for n = 5. It is clear that deciding whether or not a variance-to-mean ratio differs from 1 (indicating a departure from randomness) depends critically on the sample size.

(c) The table to the right summarizes the ratios for the five cases. As with the histograms, the confidence intervals in this table clearly indicate the decreasing variability in the variance-to-mean ratio as a function of sample size. This can also be easily observed in the ratio standard deviations. Although there was some skewness evident in the ratio distributions, there does not appear to be any systematic bias, as the ratio means are all very close to the Poisson ratio of 1.

[Table: columns Sample Size n, Mean of Ratios, SD of Ratios, Confidence Interval; intervals listed in order of increasing sample size: (.8,.6), (.5, 1.7), (.63, 1.46), (.76, 1.3), (.88, 1.13)]

(d) If someone were to come to me with count data having a mean of and a variance-to-mean ratio of 1.5, wondering whether or not the data were statistically aggregated, I would tell them that it depends on the sample size of their data. Based on this small-scale simulation study, a variance-to-mean ratio of 1.5 would not be unusual for sample sizes of 1 or 5 (since the 95% confidence intervals for the ratio contain 1.5); however, for samples of size 5 or greater, a ratio of 1.5 is unusual given the random distribution of ratios from our simulation, and we might be more inclined to conclude that the data are aggregated. Given the difficulty of finding a confidence interval for a variance-to-mean ratio analytically, a simulation such as this is useful for studying the properties of this ratio.

5. Problem 5

(a) Taking the hint, the log-likelihood function for θ is given by:

$$\log L(\theta \mid x) = \log \prod_{i=1}^{n} \left[ \frac{\theta^{x_i} e^{-\theta}}{x_i!} \right] = \log \prod_{i=1}^{n} \theta^{x_i} + \log \prod_{i=1}^{n} e^{-\theta} + \log \prod_{i=1}^{n} \frac{1}{x_i!}$$

$$= \log \theta^{\sum x_i} + \log e^{-n\theta} - K \quad \text{(where } K = \sum \log x_i! \text{ is a constant with respect to } \theta\text{)}$$

$$= \sum_{i=1}^{n} x_i \log\theta - n\theta - K.$$
Differentiating the log-likelihood with respect to θ and setting it equal to 0:

$$\left. \frac{\partial \log L(\theta \mid x)}{\partial \theta} \right|_{\theta = \hat\theta} = \frac{\sum x_i}{\hat\theta} - n = 0 \;\Longrightarrow\; \hat\theta = \frac{\sum x_i}{n} = \bar{x}.$$

Taking the 2nd derivative of log L(θ | x) with respect to θ:

$$\left. \frac{\partial^2 \log L(\theta \mid x)}{\partial \theta^2} \right|_{\theta = \hat\theta} = -\frac{\sum x_i}{\hat\theta^2} < 0,$$

so that θ̂ is a local maximum. Since θ̂ was uniquely determined, θ̂ is the absolute maximum and hence the MLE of θ.

(b) A histogram of the number of tapeworms per perch is shown to the right at the top of the next page (MatLab code given at the end of the solutions). In viewing these data, the distribution of the number of tapeworms per perch is highly right-skewed, with more than 75% of the perch having 1 tapeworm or fewer. The median number of tapeworms is 1, and the number of tapeworms ranged from 0 to 6 per perch. These data have a variance-to-mean ratio of s²/m = 1.0609/.8808 = 1.20, so based on the simulation in Problem 4, these data appear more aggregated than would be expected from a Poisson distribution.
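The summary statistics in part (b) and the maximization in part (a) can be checked numerically. The sketch below is an illustration rather than the original solution code: it works from a frequency table instead of the raw vector, uses base MATLAB only, and assumes the counts 235, 168, 75, 32, 7, 2, 1 for 0 through 6 tapeworms (the counts consistent with the reported mean and with the appendix code). The grid of θ values and all variable names are assumptions made for the example.

```matlab
x    = 0:6;                        % number of tapeworms per perch
freq = [235 168 75 32 7 2 1];      % assumed number of perch at each count

n  = sum(freq);                          % number of perch (520)
m  = sum(x.*freq)/n;                     % sample mean, about 0.88
s2 = sum(freq.*(x - m).^2)/(n - 1);      % sample variance, about 1.06
vmr = s2/m;                              % variance-to-mean ratio, about 1.20

% Log-likelihood up to the additive constant -K, evaluated on a grid
theta  = 0.5:0.001:1.5;
loglik = sum(x.*freq)*log(theta) - n*theta;
[~, idx] = max(loglik);
thetaGrid = theta(idx);                  % grid maximizer, close to m

fprintf('n = %d, mean = %.4f, variance = %.4f, ratio = %.4f\n', n, m, s2, vmr);
fprintf('grid maximizer of the log-likelihood = %.3f\n', thetaGrid);
```

The grid maximizer should agree with the sample mean to within the grid spacing, which is what the closed-form result θ̂ = x̄ predicts.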
(c) The MLE of θ for these data is computed as:

$$\hat\theta = \frac{\sum x_i}{n} = \frac{168 + 2(75) + 3(32) + 4(7) + 5(2) + 6(1)}{520} = \frac{458}{520} = .8808.$$

To find the expected frequency for X = x tapeworms per perch for x = 0, 1, ..., we multiply the total number of perch (n = 520) by:

$$P(X = x) = \frac{\hat\theta^{x} e^{-\hat\theta}}{x!} = \frac{.8808^{x} e^{-.8808}}{x!}.$$

Doing so gives the frequency table to the right. The calculations in this table were made using MatLab as shown at the end of the solutions.

[Figure: Histogram of Tapeworm Counts; x-axis: # Tapeworms per perch]

[Table: columns # tapeworms per perch, Observed, Expected, P(X = x)]

(d) Computing the chi-square test statistic using the cells in the frequency table of part (c):

$$D = \sum_{i} \frac{(O_i - E_i)^2}{E_i} = \cdots + \frac{(10 - 6.52)^2}{6.52} = 9.28.$$

With g = 5 groups, there are g - 2 = 3 degrees of freedom for this test. The p-value for this test (the probability of getting a value of 9.28 or greater from a chi-square distribution with 3 d.f.) is computed in MatLab as: 1-chi2cdf(D,3) = .0258. This p-value indicates moderate evidence of lack of fit to a Poisson model.

MatLab Code Used for Homework #1

% ======================== %
% Problem 3: Iron Data EDA %
% ======================== %
load irondiet.mat
boxplot(irondiet.fe,irondiet.type,'labels',{'Fe 3+','Fe 4+'})
xlabel('Iron Group','fontsize',14); ylabel('Percentage Iron Retained','fontsize',14)
median(irondiet.fe(irondiet.type==3))
median(irondiet.fe(irondiet.type==4))

% ============================= %
% Problem 4: Poisson simulation %
% ============================= %
theta = ;                               % Assigns the Poisson mean at
n = [ ];                                % Vector of 5 sample sizes
for i = 1:5                             % Begins loop through n-values
nsim = ;                                % Sets the number of simulations
for j = 1:nsim                          % Loops through simulations
dat = poissrnd(theta,n(i),1);           % Generates n Poisson() values
ratio(j) = var(dat)/mean(dat);          % Computes var-to-mean ratio
end                                     % End simulation loop
mrat(i) = mean(ratio);                  % Computes mean of ratios
sdrat(i) = std(ratio);                  % Computes SD of ratios
ratio = sort(ratio);                    % Sorts the ratios
ci.low(i) = ratio(round(nsim*.025));    % Ratio CI lower limit
ci.upp(i) = ratio(round(nsim*.975));    % Ratio CI upper limit
subplot(3,2,i);                         % ith plot in 3x2 window
hist(ratio)                             % Histogram of the ratios
xlabel('Variance-to-Mean Ratios')       % x-axis label on plot
ylabel( )                               % y-axis label on plot
title(['Poisson Ratios (n=',num2str(n(i)),')'])
xlim([.5]);                             % Sets x-axis limits
end                                     % End of sample size loop

% ======================================= %
% Problem 5: Histogram of Tapeworm Counts %
% ======================================= %
tapeworm = [zeros(1,235) ones(1,168) ...   % Vector of tapeworm counts
2*ones(1,75) 3*ones(1,32) ...              % created efficiently using
4*ones(1,7) 5 5 6];                        % the "ones" function
breaks = 0:6;                              % Centers for histogram bars
hist(tapeworm,breaks)                      % Histogram of tapeworm counts
xlabel('# Tapeworms per perch')
ylabel( )
title('Histogram of Tapeworm Counts')

% ======================================== %
% Problem 5: MLE & Chi-Square Computations %
% ======================================== %
mle = mean(tapeworm);                      % Mean of tapeworm counts
px = poisspdf(0:3,mle);                    % Poisson probs for 0-3
px = [px,(1-sum(px))];                     % Adds prob. for >= 4
expfreq = px*length(tapeworm);             % Computes expected frequencies
breaks = 0:4;                              % Centers for histogram counts
[obsfreq,mid] = hist(tapeworm,breaks);     % Histogram counts in "obsfreq"
D = sum((obsfreq-expfreq).^2./expfreq);    % Computes chi-squared statistic
pval = 1 - chi2cdf(D,3);                   % Computes chi-squared p-value
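As a cross-check on the Problem 5 goodness-of-fit computations that does not depend on the Statistics Toolbox, the same quantities can be obtained with base MATLAB functions. This is a hedged sketch and not the original solution code: the Poisson probabilities are written out from the formula θ^x e^(-θ)/x!, and the chi-square tail probability uses the identity that the upper tail of a chi-square distribution with ν d.f. at D equals the regularized upper incomplete gamma function evaluated at D/2 with shape ν/2. The observed counts (235, 168, 75, 32, and 10 for four or more tapeworms) and the variable names are assumptions taken from the tapeworm vector above.

```matlab
obs = [235 168 75 32 10];       % assumed observed counts for x = 0,1,2,3 and x >= 4
n   = sum(obs);                 % number of perch (520)
m   = 458/n;                    % MLE of theta: total tapeworms / number of perch

x = 0:3;
p = exp(-m)*m.^x./factorial(x); % Poisson probabilities for x = 0,...,3
p = [p, 1 - sum(p)];            % remaining probability for x >= 4
E = n*p;                        % expected frequencies

D  = sum((obs - E).^2./E);      % chi-square statistic, about 9.28
df = numel(obs) - 2;            % g - 2 = 3 (one parameter estimated)

pval = gammainc(D/2, df/2, 'upper');   % upper-tail probability, about 0.026

fprintf('D = %.2f on %d d.f., p-value = %.4f\n', D, df, pval);
```

Agreement between this value and the result of 1 - chi2cdf(D,3) is a quick way to confirm that the statistic and degrees of freedom were entered as intended.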
More informationPercentage point z /2
Chapter 8: Statistical Intervals Why? point estimate is not reliable under resampling. Interval Estimates: Bounds that represent an interval of plausible values for a parameter There are three types of
More informationClass 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 4 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 013 by D.B. Rowe 1 Agenda: Recap Chapter 9. and 9.3 Lecture Chapter 10.1-10.3 Review Exam 6 Problem Solving
More informationStatistical methods. Mean value and standard deviations Standard statistical distributions Linear systems Matrix algebra
Statistical methods Mean value and standard deviations Standard statistical distributions Linear systems Matrix algebra Statistical methods Generating random numbers MATLAB has many built-in functions
More informationStatistical Methods III Statistics 212. Problem Set 2 - Answer Key
Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423
More informationBootstrapping, Permutations, and Monte Carlo Testing
Bootstrapping, Permutations, and Monte Carlo Testing Problem: Population of interest is extremely rare spatially and you are interested in using a 95% CI to estimate total abundance. The sampling design
More informationData Analysis and Uncertainty Part 2: Estimation
Data Analysis and Uncertainty Part 2: Estimation Instructor: Sargur N. University at Buffalo The State University of New York srihari@cedar.buffalo.edu 1 Topics in Estimation 1. Estimation 2. Desirable
More informationDistribution Fitting (Censored Data)
Distribution Fitting (Censored Data) Summary... 1 Data Input... 2 Analysis Summary... 3 Analysis Options... 4 Goodness-of-Fit Tests... 6 Frequency Histogram... 8 Comparison of Alternative Distributions...
More informationStatistics. Lecture 2 August 7, 2000 Frank Porter Caltech. The Fundamentals; Point Estimation. Maximum Likelihood, Least Squares and All That
Statistics Lecture 2 August 7, 2000 Frank Porter Caltech The plan for these lectures: The Fundamentals; Point Estimation Maximum Likelihood, Least Squares and All That What is a Confidence Interval? Interval
More information13: Additional ANOVA Topics. Post hoc Comparisons
13: Additional ANOVA Topics Post hoc Comparisons ANOVA Assumptions Assessing Group Variances When Distributional Assumptions are Severely Violated Post hoc Comparisons In the prior chapter we used ANOVA
More informationUsing R in Undergraduate Probability and Mathematical Statistics Courses. Amy G. Froelich Department of Statistics Iowa State University
Using R in Undergraduate Probability and Mathematical Statistics Courses Amy G. Froelich Department of Statistics Iowa State University Undergraduate Probability and Mathematical Statistics at Iowa State
More information1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College
1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative
More informationFRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE
FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE Course Title: Probability and Statistics (MATH 80) Recommended Textbook(s): Number & Type of Questions: Probability and Statistics for Engineers
More informationGlossary. The ISI glossary of statistical terms provides definitions in a number of different languages:
Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the
More informationf(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain
0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher
More informationMachine Learning, Fall 2012 Homework 2
0-60 Machine Learning, Fall 202 Homework 2 Instructors: Tom Mitchell, Ziv Bar-Joseph TA in charge: Selen Uguroglu email: sugurogl@cs.cmu.edu SOLUTIONS Naive Bayes, 20 points Problem. Basic concepts, 0
More informationIndependent Samples ANOVA
Independent Samples ANOVA In this example students were randomly assigned to one of three mnemonics (techniques for improving memory) rehearsal (the control group; simply repeat the words), visual imagery
More informationContents 1. Contents
Contents 1 Contents 1 One-Sample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 One-sample Z-test (see Chapter 0.3.1)...... 4 1.1.2 One-sample t-test................. 6 1.1.3 Large sample
More informationSTA 101 Final Review
STA 101 Final Review Statistics 101 Thomas Leininger June 24, 2013 Announcements All work (besides projects) should be returned to you and should be entered on Sakai. Office Hour: 2 3pm today (Old Chem
More informationSampling Variability and Confidence Intervals. John McGready Johns Hopkins University
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More information