Statistical Computing Session 4: Random Simulation
1 Statistical Computing Session 4: Random Simulation
Paul Eilers & Dimitris Rizopoulos
Department of Biostatistics, Erasmus University Medical Center
Masters Track Statistical Sciences, 2010
2 The plan for today
- Short introduction: who am I?
- Random simulation: what is it good for?
- The R system of random number generation
- Tools for presentation and analysis of simulation output
- Simulation techniques
- Getting insight into classical procedures by simulation
Masters Track Statistical Sciences 2009 Statistical Computing 1
3 Why simulation?
- Statistics is about random numbers
- We assume a model for generating the data
- The model contains parameters
- We estimate them from the data, using recipes
- Examples: regression, ANOVA, time series analysis, ...
- Crucial question: how good is a recipe?
- Classical (frequentist) statistics: confidence intervals
- Bayesian: combine prior and data to get the posterior distribution
- Testing (mostly for zero values) is a special case
4 How to evaluate performance by analysis
- In some cases exact results can be obtained
- Especially for normally distributed data
- Examples: $\chi^2$, t and F distributions (ANOVA)
- You will encounter them in other courses here
- In the old days we used tables
- Nowadays they are built in as functions in statistical software
- It is always wise to use analytical results
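The built-in functions that replaced the printed tables can be tried directly. This is a minimal sketch; the probabilities and degrees of freedom are arbitrary choices for illustration, not values taken from the slides.

```r
# Critical values that used to be looked up in tables
z_crit    <- qnorm(0.975)                 # standard normal, two-sided 5%
t_crit    <- qt(0.975, df = 10)           # t with 10 degrees of freedom
chi2_crit <- qchisq(0.95, df = 1)         # chi-square with 1 degree of freedom
f_crit    <- qf(0.95, df1 = 1, df2 = 10)  # F with (1, 10) degrees of freedom

# Tail probabilities go the other way
p_upper <- 1 - pt(t_crit, df = 10)        # upper tail beyond t_crit

# Two exact relations between these distributions
chi2_crit - z_crit^2                      # essentially zero
f_crit - t_crit^2                         # essentially zero
```

The last two lines use the facts that a squared standard normal variate is $\chi^2_1$ and a squared t variate is F with one numerator degree of freedom.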
5 How to evaluate performance by simulation
- Simulate data according to the model
- Apply the recipe
- Repeat this many times
- Study the distributions of parameter estimates
- Confidence intervals follow from them
- Probabilities of outcomes of tests can be determined
- Disadvantage: generalization can be hard
- What if I have more/fewer observations, larger variances, ...?
- You might have to simulate again for every case
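The steps above can be sketched for a very simple recipe: estimating the mean of normal data. The model parameters, sample size and number of repetitions are assumptions chosen for illustration only.

```r
set.seed(123)                 # fixed seed so the run is reproducible
n_rep <- 2000                 # number of simulated data sets (assumed)
n     <- 20                   # observations per data set (assumed)
mu    <- 5; sigma <- 2        # "true" model parameters (assumed)

est     <- numeric(n_rep)     # estimates of mu
covered <- logical(n_rep)     # does the 95% t interval contain mu?
for (i in seq_len(n_rep)) {
  y <- rnorm(n, mean = mu, sd = sigma)    # 1. simulate data from the model
  est[i] <- mean(y)                       # 2. apply the recipe
  ci <- t.test(y)$conf.int                #    classical 95% interval
  covered[i] <- ci[1] <= mu && mu <= ci[2]
}                                         # 3. repeat many times

# 4. study the distribution of the estimates
quantile(est, c(0.025, 0.975))   # empirical 95% range of the estimator
sd(est)                          # compare with sigma / sqrt(n)
mean(covered)                    # empirical coverage, expected near 0.95
```

Changing n or sigma means running the loop again, which is exactly the disadvantage named on the slide.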
6 The R system for generating random numbers
- For a given distribution you have four choices
- I use the normal distribution as an example
- Give me 100 drawings from the standard normal: rnorm(100)
- The density over a vector x: dnorm(x)
- The cumulative distribution over a vector x: pnorm(x)
- The quantiles over a vector of probabilities p: qnorm(p)
- Let's take a better look
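A short sketch of how the four functions relate to each other. The evaluation points are arbitrary; the exponential distribution is added only to show that the same r/d/p/q naming pattern applies to other distributions.

```r
set.seed(1)
x <- c(-1.96, 0, 1.96)   # arbitrary evaluation points

d <- dnorm(x)            # density at x
p <- pnorm(x)            # cumulative probabilities P(X <= x)
q <- qnorm(p)            # quantiles: qnorm() inverts pnorm(), so q equals x
y <- rnorm(100)          # 100 random draws

# The same prefixes work for other distributions, e.g. the exponential
q_exp <- qexp(pexp(2))   # gives back 2
```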
7 Example: normal density
x = seq(-4, 4, by = 0.1); y = dnorm(x)
plot(x, y, type = "l", col = "blue", lwd = 2)
[Figure: normal density, y against x]
8 Example: cumulative normal distribution
x = seq(-4, 4, by = 0.1); y = pnorm(x)
plot(x, y, type = "l", col = "blue", lwd = 2)
[Figure: cumulative normal distribution, y against x]
9 Example: normal quantiles
p = seq(0.001, 0.999, by = 0.001)
y = qnorm(p)
plot(p, y, type = "l", lwd = 2, col = "blue")
[Figure: normal quantiles, y against p]
10 How to look at random numbers
- The result of rnorm() is a series of numbers
- There is little use in plotting them individually
- Instead we use the histogram
- We divide the x-axis into a number of bins
- In each bin we count the number of observations
- We plot the counts as a bar chart
11 Example of a histogram
y = rnorm(200)
hist(y)
[Figure: "Histogram of y"; Frequency against y]
12 Playing with the histogram
y = rnorm(200)
b = seq(-4, 4, by = 0.2)  # Narrower bins
hist(y, breaks = b, col = "gray", border = "white")
[Figure: "Histogram of y"; Frequency against y]
13 Adding the theoretical normal density
y = rnorm(200); b = seq(-4, 4, by = 0.2)
hist(y, breaks = b, col = "gray", border = "white", freq = FALSE)
lines(b, dnorm(b), col = "blue", lwd = 2)
[Figure: "Histogram of y" with the normal density overlaid; Density against y]
14 A different result every time?
- Look at the histograms carefully
- They seem to be based on different numbers
- That's indeed the case
- Every call to rnorm() gives a different result
- You might not like that (and you should not)
- It makes it hard to compare results
- Set the random seed to a starting value
- Like set.seed(123) or set.seed(911)
- Any integer number you like (see R help for details)
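A small sketch of what setting the seed does. The variable names are only for illustration.

```r
set.seed(123); a <- rnorm(5)   # first run
set.seed(123); b <- rnorm(5)   # same seed again: the same five numbers
d <- rnorm(5)                  # no reset: the generator has moved on

same_ab <- identical(a, b)     # TRUE
same_ad <- identical(a, d)     # FALSE
```

Resetting the seed before each experiment makes results comparable; leaving it out shows the randomness.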
15 The empirical cumulative distribution
n = 20; y = rnorm(n)
p = ((1:n) - 0.5) / n  # Empirical probabilities
plot(sort(y), p, type = "s", col = "red")
[Figure: empirical cumulative distribution; p against sort(y)]
16 Adding the theoretical cumulative distribution
n = 40; y = rnorm(n); p = ((1:n) - 0.5) / n
plot(sort(y), p, type = "s", col = "red")
x = seq(-3, 3, by = 0.1); lines(x, pnorm(x), col = "blue", lwd = 2)
[Figure: empirical and theoretical cumulative distributions; p against sort(y)]
17 Randomness at work
- The theoretical distribution does not fit impressively
- This is generally the case for small samples
- Run these programs many times
- (Without set.seed(), of course)
- You will see randomness at work
- You will appreciate variation in small samples better
18 The Q-Q plot
- We have our sorted observations
- And the empirical cumulative probabilities
- For the latter we can compute normal quantiles (with qnorm())
- Plot one against the other
- We'll program it ourselves
- That has some educational value, I hope
- The function qqnorm() is easier to use
- Add a reference line with qqline(); by default it runs through the first and third quartiles
- Of course, this only compares to the normal distribution
19 Q-Q plot from scratch
n = 40; y = rnorm(n); p = ((1:n) - 0.5) / n
plot(qnorm(p), sort(y), col = "red")
[Figure: sort(y) against qnorm(p)]
20 Q-Q plot the easy way
n = 40; y = rnorm(n); qqnorm(y)
qqline(y, col = "blue", lwd = 2)
[Figure: "Normal Q-Q Plot"; Sample Quantiles against Theoretical Quantiles]
21 An improvement on the histogram
- The histogram is a great idea
- But it is sensitive to the width of the bins
- And to their positions (midpoints)
- An improvement is the kernel density estimator
- It is explained in Chapter 10 of the book
- We will discuss that later
- And develop a further improvement
22 Comparing histogram and kernels
y = rnorm(200); b = seq(-4, 4, by = 0.2)
plot(density(y), col = "red", lwd = 2)
lines(b, dnorm(b), col = "blue", lwd = 2)
[Figure: two panels, "Histogram of y" and "density.default(x = y)"; Density against y; N = 200]
23 Part II
24 How to generate random numbers
- The uniform distribution is always the starting point
- Sum 6 (or more) uniforms to approximate the normal distribution
- This is of limited usefulness
- Apply a function: -log(runif(100)) gives the exponential distribution
- Inverse transform: compute quantiles of the desired distribution (if possible)
- Acceptance-rejection sampling
- Combinations of distributions
- Mixtures
25 Sums of uniform random variables
- The uniform distribution runs from 0 to 1
- Mean 0.5, variance 1/12
- The sum of n (independent) uniforms has mean n/2
- The variance is n/12
- Subtract n/2
- Divide by sqrt(n/12), the standard deviation
- To get (approximately) the standard normal distribution
- This is an approximation, but a good one
26 Illustration of sums
Y = matrix(runif(10000 * 10), 10000, 10); y = apply(Y, 1, sum)
plot(density(y), col = "red", lwd = 2)
x = seq(0, 10, by = 0.1); lines(x, dnorm(x, 10 * 0.5, sqrt(10 / 12)), col = "blue", lwd = 2)
[Figure: "density.default(x = y)"; kernel density of the sums with the normal density overlaid]
27 Simple transformations
- Apply a function to uniformly distributed random numbers
- Example: y = -log(runif(n)) gives the exponential distribution
- Sums of these lead to the gamma distribution
- Many such tricks exist
- See examples in the book
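Both tricks on this slide can be checked numerically. A minimal sketch, assuming rate 1 and sums of k = 3 exponentials; these values are illustrative choices, not from the slides.

```r
set.seed(42)
n <- 10000
k <- 3                                   # exponentials per sum (assumed)

e <- -log(runif(n))                      # exponential with rate 1
E <- matrix(-log(runif(n * k)), n, k)    # n rows of k exponentials
g <- rowSums(E)                          # gamma with shape k and rate 1

c(mean(e), var(e))                       # theory: 1 and 1
c(mean(g), var(g))                       # theory: k and k

# Compare a few empirical quantiles with the gamma quantiles
pr <- c(0.1, 0.5, 0.9)
rbind(empirical = quantile(g, pr), theory = qgamma(pr, shape = k))
```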
28 Inverse transform or quantiles
- Remember the cumulative distribution: $p = F(x)$
- Where $F(x) = \int_{-\infty}^{x} f(t)\,dt$
- And $f(x)$ is the probability density
- We have that $0 < p < 1$
- Turn things around: $x = F^{-1}(p)$
- Generate uniformly distributed $p$
- Compute $x = F^{-1}(p)$
- Then $x$ will have density $f$
- Of course, you need a procedure to invert $F$
- Example: qnorm(runif(n)) has the same distribution as rnorm(n)
29 Illustration of the quantile approach
[Figure: pnorm(x); axes p and x]
30 A special case: the exponential distribution
- Density: $f(x) = e^{-x}$ (positive $x$)
- Cumulative distribution: $F(x) = 1 - e^{-x}$
- You can check this by integrating it yourself
- From $p = 1 - e^{-x}$ follows $x = -\log(1 - p)$
- But if $p$ is from the uniform distribution, so is $1 - p$
- Hence $x = -\log(p)$ has the exponential distribution
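The derivation can be checked in a few lines. A sketch under the slide's assumption of rate 1; the sample size is arbitrary.

```r
set.seed(7)
n <- 10000
p <- runif(n)                 # uniform probabilities

x1 <- -log(1 - p)             # x = F^{-1}(p) for F(x) = 1 - exp(-x)
x2 <- -log(p)                 # equally valid, because 1 - p is also uniform

same_as_qexp <- isTRUE(all.equal(x1, qexp(p)))  # R's quantile function agrees
c(mean(x1), mean(x2))         # both should be close to 1
```

Note that x1 and x2 are different numbers for the same p; they only share the same distribution.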
31 Inverting the cumulative distribution
- You need a formula for $F^{-1}(p)$
- In many cases an exact formula is not available
- But over the years very good approximations were found
- Try ?qnorm to see pointers to the literature
- Luckily, R has many built-in procedures
32 Acceptance-rejection sampling
- Suppose you have a formula for $f(x)$, the density
- But none for $F(x)$, the cumulative distribution
- Acceptance-rejection sampling is a solution
- Select a domain in which $x$ will fall (almost surely)
- Let $x_1$ ($x_2$) be the left (right) boundary
- Let $u = \max f(x)$ (over this domain)
- Generate $y$ uniformly distributed between $x_1$ and $x_2$
- Generate $z$ uniformly distributed between 0 and $u$
- If $z < f(y)$, accept $y$
- If not, repeat
33 Illustration of acceptance-rejection sampling
xlo = -3; xhi = 3
x = seq(xlo, xhi, by = 0.1)
f = dnorm(x)
plot(x, f, lwd = 2, type = "l", col = "blue")
u = max(f)
n = 1000
a = rep(0, n)
y = runif(n) * (xhi - xlo) + xlo  # Candidates, uniform on [xlo, xhi]
z = runif(n) * u                  # Heights, uniform on [0, u]
a = 1 * (z < dnorm(y))            # 1 = accepted, 0 = rejected
g = c("-", "+")                   # Plot symbols: rejected, accepted
points(y, z, pch = g[a + 1])
34 The result of acceptance sampling
[Figure: density curve with accepted and rejected points; f against x]
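The illustration only marks the points; to use the method as a sampler, the accepted candidates are kept. A minimal sketch, assuming the standard normal as target and the domain [-3, 3] as in the illustration; the vectorized form and the variable names are choices made here, not part of the slides.

```r
set.seed(2024)
f   <- dnorm                   # target density (assumed example)
xlo <- -3; xhi <- 3            # domain in which x falls almost surely
u   <- f(0)                    # maximum of the density on the domain
n   <- 20000                   # number of candidates

y <- runif(n, xlo, xhi)        # candidates, uniform on the domain
z <- runif(n, 0, u)            # heights, uniform between 0 and u
keep <- z < f(y)               # accept the points below the curve
x <- y[keep]                   # accepted values: a sample from f

rate   <- mean(keep)           # observed acceptance rate
theory <- (pnorm(xhi) - pnorm(xlo)) / (u * (xhi - xlo))  # area ratio
c(rate, theory)
c(mean(x), sd(x))              # close to 0 and 1
```

The acceptance rate equals the area under the density divided by the area of the bounding rectangle, so a tight rectangle wastes fewer candidates.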
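35 Combinations of distributions
- Example: t distribution
- Let $s$ be the sum of $n$ squared standard normal variates
- Then $s$ has a $\chi^2_n$ distribution ($n$ degrees of freedom)
- Let $x$ be one standard normal variate, independent of $s$
- Then $x / \sqrt{s/n}$ has a t distribution with $n$ degrees of freedom
- Ratio of gammas: for independent gammas $X$ and $Y$ with a common scale, $X/(X+Y)$ has a beta distribution
- Ratio of two independent standard normals has a Cauchy distribution
- Ratio of two independent $\chi^2$ variates, each divided by its degrees of freedom, has an F distribution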
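The t construction can be verified by simulation. A sketch assuming 5 degrees of freedom and 20000 replications, both arbitrary; it divides by sqrt(s / df), which is the scaling needed for a t variate.

```r
set.seed(99)
n_rep <- 20000
df    <- 5                                   # degrees of freedom (assumed)

Z  <- matrix(rnorm(n_rep * df), n_rep, df)   # df standard normals per row
s  <- rowSums(Z^2)                           # chi-square with df degrees of freedom
x  <- rnorm(n_rep)                           # independent standard normal
tt <- x / sqrt(s / df)                       # t with df degrees of freedom

mean(s)                                      # theory: df
pr <- c(0.05, 0.25, 0.5, 0.75, 0.95)
rbind(empirical = quantile(tt, pr), theory = qt(pr, df))
```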
36 Mixtures
- Mixtures are a very flexible tool
- Basic idea: probability $p_1$ to draw from density $f_1$
- Probability $p_2$ to draw from density $f_2$
- And so on
- Typical application: outlier simulation
- Two normals: $f_1(x) = N(x; \mu, \sigma)$, $f_2(x) = N(x; \mu, 10\sigma)$
- Probabilities: $p_1 = 0.98$, $p_2 = 0.02$
- A 2% chance of getting a large outlier
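The outlier mixture can be simulated by first drawing the component and then drawing from it. A minimal sketch, assuming mu = 0 and sigma = 1; the variable names are illustrative.

```r
set.seed(11)
n  <- 10000
mu <- 0; sigma <- 1                          # assumed parameter values

from_f2 <- runif(n) < 0.02                   # TRUE with probability p2 = 0.02
sd_i    <- ifelse(from_f2, 10 * sigma, sigma)  # standard deviation per draw
x       <- rnorm(n, mean = mu, sd = sd_i)    # one draw per chosen component

mean(from_f2)                                # fraction from f2, near 0.02
var(x)                                       # theory: 0.98 * 1 + 0.02 * 100 = 2.98
frac_far <- mean(abs(x - mu) > 4 * sigma)    # fraction beyond 4 sigma
frac_far
```

For a pure standard normal almost no observations fall beyond 4 sigma, so frac_far shows the effect of the contaminating component.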
37 The use of random numbers
- You probably will not often aim for a specific distribution
- Instead you program a data generating procedure
- Mimicking a model you are working on
- You study the distribution of statistical summaries
- Educational example: ANOVA and the F distribution
- Simulate from the null hypothesis: 5 groups, 10 observations in each
- Compare empirical (n = 5000) distributions ($\chi^2$ and F) with theory
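The ANOVA example can be sketched as follows. This is not the authors' code; it assumes standard normal data in every group (the null hypothesis with unit variance) and uses the group and replication counts from the slide.

```r
set.seed(2010)
n_sim <- 5000; k <- 5; m <- 10            # simulations, groups, obs. per group

ssw <- ssb <- numeric(n_sim)
for (i in seq_len(n_sim)) {
  Y  <- matrix(rnorm(k * m), m, k)        # columns are groups, all means equal
  gm <- colMeans(Y)                       # group means
  ssw[i] <- sum(sweep(Y, 2, gm)^2)        # sum of squares within
  ssb[i] <- m * sum((gm - mean(Y))^2)     # sum of squares between
}
msb   <- ssb / (k - 1)                    # mean square between
msw   <- ssw / (k * (m - 1))              # mean square within
Fstat <- msb / msw

c(mean(ssw), k * (m - 1))                 # chi-square with 45 df has mean 45
c(mean(ssb), k - 1)                       # chi-square with 4 df has mean 4
pr <- c(0.5, 0.9, 0.95)
rbind(empirical = quantile(Fstat, pr),
      theory    = qf(pr, k - 1, k * (m - 1)))
reject <- mean(Fstat > qf(0.95, k - 1, k * (m - 1)))  # expected near 0.05
reject
```

With unit variance, SSW and SSB follow $\chi^2$ distributions with 45 and 4 degrees of freedom, and their scaled ratio follows the F distribution.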
38 ANOVA simulation: examples of simulated data
39 ANOVA: simulated sums of squares and ratio
[Figure: three panels plotted against Index: "Sum of squares within (SSW)" (ssw), "Sum of squares between (SSB)" (ssb), "Ratio MSB/MSW" (msb/msw)]
40 ANOVA: histograms and expected values
[Figure: three histograms (Frequency): "Sum of squares within (SSW)" (ssw), "Sum of squares between (SSB)" (ssb), "F = MSB / MSW" (msb/msw)]
41 ANOVA: histograms and theoretical densities
[Figure: three histograms (Frequency) with theoretical densities: "Sum of squares within (SSW)" (ssw), "Sum of squares between (SSB)" (ssb), "F = MSB / MSW" (msb/msw)]
42 Reading hints
- Theoretical statistical background: Chapter 2
- Random number simulation: Section
- Kernel density estimation: Section 10.2
- Histogram and qqplot: R on-line documentation
More informationStatistical techniques for data analysis in Cosmology
Statistical techniques for data analysis in Cosmology arxiv:0712.3028; arxiv:0911.3105 Numerical recipes (the bible ) Licia Verde ICREA & ICC UB-IEEC http://icc.ub.edu/~liciaverde outline Lecture 1: Introduction
More informationBEGINNING BAYES IN R. Bayes with discrete models
BEGINNING BAYES IN R Bayes with discrete models Beginning Bayes in R Survey on eating out What is your favorite day for eating out? Construct a prior for p Define p: proportion of all students who answer
More informationChapte The McGraw-Hill Companies, Inc. All rights reserved.
er15 Chapte Chi-Square Tests d Chi-Square Tests for -Fit Uniform Goodness- Poisson Goodness- Goodness- ECDF Tests (Optional) Contingency Tables A contingency table is a cross-tabulation of n paired observations
More informationISQS 5349 Final Exam, Spring 2017.
ISQS 5349 Final Exam, Spring 7. Instructions: Put all answers on paper other than this exam. If you do not have paper, some will be provided to you. The exam is OPEN BOOKS, OPEN NOTES, but NO ELECTRONIC
More informationGOV 2001/ 1002/ E-2001 Section 3 Theories of Inference
GOV 2001/ 1002/ E-2001 Section 3 Theories of Inference Solé Prillaman Harvard University February 11, 2015 1 / 48 LOGISTICS Reading Assignment- Unifying Political Methodology chs 2 and 4. Problem Set 3-
More informationCS 361: Probability & Statistics
March 14, 2018 CS 361: Probability & Statistics Inference The prior From Bayes rule, we know that we can express our function of interest as Likelihood Prior Posterior The right hand side contains the
More informationLecture 21: October 19
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use
More informationEE/CpE 345. Modeling and Simulation. Fall Class 9
EE/CpE 345 Modeling and Simulation Class 9 208 Input Modeling Inputs(t) Actual System Outputs(t) Parameters? Simulated System Outputs(t) The input data is the driving force for the simulation - the behavior
More informationModeling and Performance Analysis with Discrete-Event Simulation
Simulation Modeling and Performance Analysis with Discrete-Event Simulation Chapter 9 Input Modeling Contents Data Collection Identifying the Distribution with Data Parameter Estimation Goodness-of-Fit
More informationStatistical distributions: Synopsis
Statistical distributions: Synopsis Basics of Distributions Special Distributions: Binomial, Exponential, Poisson, Gamma, Chi-Square, F, Extreme-value etc Uniform Distribution Empirical Distributions Quantile
More informationdetermine whether or not this relationship is.
Section 9-1 Correlation A correlation is a between two. The data can be represented by ordered pairs (x,y) where x is the (or ) variable and y is the (or ) variable. There are several types of correlations
More informationThe linear model is the most fundamental of all serious statistical models encompassing:
Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x
More informationComputer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.
Simulation Discrete-Event System Simulation Chapter 8 Input Modeling Purpose & Overview Input models provide the driving force for a simulation model. The quality of the output is no better than the quality
More informationMathematical Statistics
Mathematical Statistics MAS 713 Chapter 8 Previous lecture: 1 Bayesian Inference 2 Decision theory 3 Bayesian Vs. Frequentist 4 Loss functions 5 Conjugate priors Any questions? Mathematical Statistics
More informationTables Table A Table B Table C Table D Table E 675
BMTables.indd Page 675 11/15/11 4:25:16 PM user-s163 Tables Table A Standard Normal Probabilities Table B Random Digits Table C t Distribution Critical Values Table D Chi-square Distribution Critical Values
More informationMedical statistics part I, autumn 2010: One sample test of hypothesis
Medical statistics part I, autumn 2010: One sample test of hypothesis Eirik Skogvoll Consultant/ Professor Faculty of Medicine Dept. of Anaesthesiology and Emergency Medicine 1 What is a hypothesis test?
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2018 Examinations Subject CT3 Probability and Mathematical Statistics Core Technical Syllabus 1 June 2017 Aim The
More informationHandout 4: Simple Linear Regression
Handout 4: Simple Linear Regression By: Brandon Berman The following problem comes from Kokoska s Introductory Statistics: A Problem-Solving Approach. The data can be read in to R using the following code:
More informationDETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics
DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and
More informationLecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2
Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y
More informationUQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables
UQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables To be provided to students with STAT2201 or CIVIL-2530 (Probability and Statistics) Exam Main exam date: Tuesday, 20 June 1
More informationStatistics and Data Analysis
Statistics and Data Analysis The Crash Course Physics 226, Fall 2013 "There are three kinds of lies: lies, damned lies, and statistics. Mark Twain, allegedly after Benjamin Disraeli Statistics and Data
More informationStatistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg
Statistics for Data Analysis PSI Practical Course 2014 Niklaus Berger Physics Institute, University of Heidelberg Overview You are going to perform a data analysis: Compare measured distributions to theoretical
More information