Stat 139 Homework 2 Solutions, Spring 2015
Problem 1. A pharmaceutical company is surveying 50 different targeted compounds to try to determine whether any of them may be useful in treating migraine headaches. From previous experiments like this, they believe that each compound independently has a 1/100 chance of truly being effective, and a 99/100 chance of having zero effect. For each potential compound, they perform a hypothesis test to determine whether it is effective at $\alpha = 0.05$. An effective drug will be statistically significant based on this hypothesis test 80% of the time (which is called statistical power).

(a) What is the expected number of compounds that will be shown to be statistically significant based on these fifty separate hypothesis tests?

Let $E_i$ be the event that the $i$th compound is truly effective, and let $R_i$ be the event that its test rejects the null hypothesis (i.e., the compound is statistically significant). For any one compound, the probability of being statistically significant can be calculated using the Law of Total Probability:

$$P(R_i) = P(R_i \cap E_i) + P(R_i \cap E_i^C) = P(R_i \mid E_i)P(E_i) + P(R_i \mid E_i^C)P(E_i^C) = (0.80)(0.01) + (0.05)(0.99) = 0.0575$$

Let $X_i$ be the indicator r.v. for whether the $i$th compound is statistically significant, and let $T$ be the total number of compounds that are shown to be statistically significant. By linearity of expectation, the expected number is:

$$E(T) = E(X_1 + X_2 + \cdots + X_{50}) = E(X_1) + E(X_2) + \cdots + E(X_{50}) = 50(0.0575) = 2.875$$

(b) Given a compound is flagged as statistically significant, what is the probability that it is actually effective in treating migraine headaches?

Here we want to determine $P(E_i \mid R_i)$, which we can calculate using Bayes' Rule (since we are flipping which event is conditioned on):

$$P(E_i \mid R_i) = \frac{P(R_i \mid E_i)P(E_i)}{P(R_i \mid E_i)P(E_i) + P(R_i \mid E_i^C)P(E_i^C)} = \frac{(0.80)(0.01)}{(0.80)(0.01) + (0.05)(0.99)} = \frac{0.008}{0.0575} \approx 0.139$$

(c) After testing the 50 potential compounds, the company has exactly 1 compound that was deemed to be statistically significant based on the tests.
Let $\pi$ be the probability that it is actually effective in treating migraine headaches. How does $\pi$ compare to your result in part (b)? Explain briefly.

It depends on whether you think the characteristics of the testing and the compounds are fixed as given in the problem statement. Intuitively speaking, if you see fewer compounds flagged as significant than expected, then there is a good chance that the assumed 1/100 rate of truly effective compounds is an overestimate, so one could argue that $\pi$ should be even lower than the value found in part (b). If you think the 1/100 probability is truly a known and fixed value, then $\pi$ should be the same as in part (b). The key is that the characteristics of the test and the compounds have not changed, so for any selected compound, the probability that it is truly effective should still be the part (b) value, no matter how many statistically significant compounds you find out of the 50 compounds tested.

Problem 2. ACT scores of high school seniors. The scores of high school seniors on the ACT college entrance examination in a recent year had mean $\mu = 19.2$ and standard deviation $\sigma = 5.1$. The distribution of individual scores is only roughly Normal.
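The Normal-approximation calculations this problem asks for can be sketched in R. This is only an illustrative check: the variable names are hypothetical choices rather than anything from the course materials, and the only inputs taken from the problem are $\mu = 19.2$, $\sigma = 5.1$, the sample size of 25, and the cutoff score of 23.

```r
# Values stated in the problem
mu    <- 19.2   # population mean ACT score
sigma <- 5.1    # population standard deviation
n     <- 25     # size of the simple random sample

# Single student: z-score of 23 and the upper-tail Normal probability
z_single <- (23 - mu) / sigma
p_single <- pnorm(z_single, lower.tail = FALSE)

# Sample mean of n students: its standard deviation is sigma / sqrt(n)
se_mean <- sigma / sqrt(n)
z_mean  <- (23 - mu) / se_mean
p_mean  <- pnorm(z_mean, lower.tail = FALSE)

# Collect the results (rounded for display)
results <- round(c(z_single = z_single, p_single = p_single,
                   se_mean = se_mean, z_mean = z_mean, p_mean = p_mean), 4)
print(results)
```

Using `pnorm(..., lower.tail = FALSE)` returns the upper-tail area directly, which avoids the loss of precision in `1 - pnorm(...)` when the tail probability is very small.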
(a) What is the approximate probability that a single student randomly chosen from all those taking the test scores 23 or higher?

Let $X$ be the test score of a single student. Then:

$$P(X \ge 23) \approx P\left(Z > \frac{23 - 19.2}{5.1}\right) = P(Z > 0.745) \approx 0.228$$

(b) Now take an SRS of 25 students who took the test. What would be the mean and standard deviation of the sampling distribution of the sample mean score, $\bar{X}$, for $n = 25$ students?

By the standard results for the mean of independent observations, $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \sigma^2/n$, where $\mu$ and $\sigma$ are the mean and standard deviation for a single student. Then $E(\bar{X}) = 19.2$ and $SD(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \sqrt{5.1^2/25} = 1.02$ in this problem.

(c) What is the approximate probability that the mean score of these 25 students is 23 or higher?

By the Central Limit Theorem, we know that $\bar{X}$ is approximately Normal with the mean and standard deviation found in part (b). Then:

$$P(\bar{X} \ge 23) \approx P\left(Z > \frac{23 - 19.2}{1.02}\right) = P(Z > 3.73) < 0.0001$$

(d) Which of your two Normal probability calculations in parts (a) and (c) is more accurate? Why?

The distribution of single-student scores is only roughly Normal (it is very discretized, after all, since individual ACT scores can only be whole numbers), but the sampling distribution of $\bar{X}$ is closer to Normal by the CLT (although still approximate, and its values can be multiples of 1/25). So we believe that the calculation in part (c) is more accurate.

Problem 3. The sum of squares of a sample of data is minimized when the sample mean, $\bar{X} = \sum X_i / n$, is used as the basis of the calculation. Define $g(c)$ as a function of $c$: $g(c) = \sum (X_i - c)^2$. Show that this function is minimized at the value $c = \bar{X}$.

In order to minimize a function, we take the first derivative (w.r.t. $c$) and set it to zero. Then we take the second derivative and make sure it is positive at $\bar{x}$ (concave up):

$$g'(c) = -2\sum (x_i - c) = 0 \implies \sum c = \sum x_i \implies nc = \sum x_i \implies c = \sum x_i / n = \bar{x}$$

$$g''(c) = 2\sum 1 = 2n > 0$$

Problem 4. Let $X_1, \ldots, X_i, \ldots, X_n$ be independent random variables drawn from a population with mean $\mu$ and variance $\sigma^2$. Let $\bar{X}$ be the sample average.
Recall that $\sigma^2$ can be estimated by $S^2$, the usual sample variance, defined as:

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)$$
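The second equality in the definition of $S^2$ is used repeatedly in what follows, so it may help to see where it comes from. A short worked step, relying only on the fact that $\sum X_i = n\bar{X}$:

```latex
\begin{align*}
\sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \sum_{i=1}^{n} X_i^2 - 2\bar{X}\sum_{i=1}^{n} X_i + n\bar{X}^2 \\
  &= \sum_{i=1}^{n} X_i^2 - 2n\bar{X}^2 + n\bar{X}^2 \\
  &= \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 .
\end{align*}
```

Dividing both sides by $n - 1$ gives the two equivalent forms of $S^2$.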
(a) Show that $E(X_i^2) = \sigma^2 + \mu^2$, using the fact that $\sigma^2 = E\left((X_i - \mu)^2\right)$.

$$E(X_i^2) = E(X_i^2 - 2\mu^2 + 2\mu^2) = E(X_i^2 - 2\mu X_i + \mu^2) + E(\mu^2) = E\left((X_i - \mu)^2\right) + E(\mu^2) = \sigma^2 + \mu^2$$

Note: $E(\mu^2) = \mu E(\mu) = \mu E(X_i) = E(\mu X_i)$.

(b) Show that $E(S^2) = \sigma^2$, i.e., $S^2$ is an unbiased estimator of the population variance.

$$E(S^2) = E\left[\frac{1}{n-1}\left(\sum X_i^2 - n\bar{X}^2\right)\right] = \frac{1}{n-1}\left(\sum E(X_i^2) - nE(\bar{X}^2)\right) = \frac{1}{n-1}\left(n(\sigma^2 + \mu^2) - n(\sigma^2/n + \mu^2)\right) = \frac{1}{n-1}(n-1)\sigma^2 = \sigma^2$$

Note: $E(\bar{X}^2) = \sigma_{\bar{X}}^2 + \mu_{\bar{X}}^2 = \sigma^2/n + \mu^2$, which follows from applying part (a) to $\bar{X}$, whose mean is $\mu$ and whose variance is $\sigma^2/n$.

Problem 5. Let $X_1, X_2, \ldots, X_{25}$ be i.i.d. Normal r.v.s with mean $\mu = 1$ and variance $\sigma^2 = 3^2 = 9$. Let $S^2$ be the usual variance estimate, $S^2 = \sum (X_i - \bar{X})^2/(n-1)$, and let $\hat{\sigma}^2$ be the estimate using $\mu$ in the calculation instead: $\hat{\sigma}^2 = \sum (X_i - \mu)^2/n$. Write a simulation in R, using a for-loop based on at least 10,000 iterations, to determine the following (be sure to include the relevant R code and output):

(a) That both estimators ($S^2$ and $\hat{\sigma}^2$) are unbiased.

Based on 10,000 iterations, the observed means of both estimators were within 0.01 units of the true variance of 9. We could formally test whether each observed mean is significantly different from 9 (based on n = 10,000 realizations), but that is overkill. Here is the relevant R code:

> nsims=10000
> mu=1
> sigma=3
> n=25
> sigma2.hat=s2=rep(NA,nsims)
>
> for(i in 1:nsims){
+   sample=rnorm(n,mean=mu,sd=sigma)
+   xbar=mean(sample)
+   sigma2.hat[i]=sum((sample-mu)^2)/n
+   s2[i]=var(sample)
+ }
> mean(sigma2.hat)
[1]
> mean(s2)
[1]

(b) Provide a separate histogram for each of the two sampling distributions. Which has lower spread?

Based on the R output below, $\hat{\sigma}^2$ has slightly smaller spread than $S^2$ (about 3% lower standard deviation).
> sd(sigma2.hat)
[1]
> sd(s2)
[1]

[Figure: two histograms, "Histogram of sigma2.hat" and "Histogram of s2"; x-axes sigma2.hat and s2, y-axes Frequency.]

(c) Which estimator is closer to the true value more often?

Based on the R output below, $\hat{\sigma}^2$ is as close or closer than $S^2$ about 52.4% of the time.

> mean(abs(sigma2.hat-sigma^2)>abs(s2-sigma^2))
[1]

(d) Are you sure of your answers above? What could you do to be more certain?

No, I am not certain of the answers above, since they are based on random simulations. We could be more certain if we based this study on more iterations, or if we performed a formal test to see whether the results above were statistically significant.

Problem 6. The BOSsnowfall.csv data set on the course website has weather measurements made at Logan Airport. There are two variables in this data set measured annually from winter until winter :

totalsnow: the total amount of snowfall for a winter season, in inches
avgmaxtemp: the average daily high temperature for the previous calendar year, in degrees F

(a) Calculate the following summary statistics for both the totalsnow and avgmaxtemp variables: sample mean, sample SD, min, median, max, 1st and 3rd quartiles.

> summary(snow)
     season     totalsnow         avgmaxtemp
        : 1   Min.   : 9.00    Min.   :
        : 1   1st Qu.:         1st Qu.:
        : 1   Median :         Median :
        : 1   Mean   :         Mean   :
        : 1   3rd Qu.:         3rd Qu.:59.63
        : 1   Max.   :         Max.   :61.27
 (Other):85
> sd(snow$totalsnow)
[1]
> sd(snow$avgmaxtemp)
[1]

(b) Split the observations into two groups: the winters with avgmaxtemp at or below the 3rd quartile, as calculated in part (a), vs. the winters above the 3rd quartile. Plot side-by-side boxplots of totalsnow for the two groups and describe the shapes of their distributions. Are there any visible differences?

[Figure: side-by-side boxplots of totalsnow for the High and Low groups (left); "Histogram of meandiff.sim", x-axis meandiff.sim, y-axis Frequency (right).]

The boxplot to the left above shows the annual snowfall for years when the average maximum temperature is in the top quartile vs. the bottom 75%. Both boxplots appear to be right-skewed. When the temperature is cooler, there appears to be more snowfall, on average. There also seems to be more spread in the cooler group, but this may just be because there are more observations in that group (3-to-1). More details below.

(c) Comment on whether you think the group means are very different or not (without conducting any formal tests).

Based on the side-by-side boxplots above, it appears that the High group (when the average temperature for the year is above 59.63, the 3rd quartile) typically has lower amounts of snowfall. The median (the line inside the box) is lower, the middle 50% of the distribution (the box) is shifted down, and the highest values are lower for the High group as well compared to the Low group.

(d) Perform a permutation test based on 10,000 iterations to determine whether totalsnow differs between winters where the temperature was at or below the 3rd quartile vs. above the 3rd quartile. Please refer to the Unit 2 lecture notes for useful R code. Be sure to state the hypotheses, calculate the test statistic, produce a histogram of the reference distribution, calculate the p-value based on this distribution, and state the conclusion of the procedure (be sure to mention the scope of the inference).
Here is some relevant R output (see HW 2 Solutions R Code.R for the remaining R commands used).

> meandiff.obs
[1]
> mean(meandiff.sim)
[1]
> sd(meandiff.sim)
[1]
> #two-sided p-value
> mean( abs(meandiff.sim) >= abs(meandiff.obs) )
[1]

Based on the R output and the histogram above (the reference distribution for the test statistic), we can perform the following hypothesis test (a permutation test) at the $\alpha = 0.05$ level, where $Y_{high} = Y_{low} + \delta$:

$H_0: \delta = 0$ vs. $H_A: \delta \ne 0$
$T = \bar{Y}_{high} - \bar{Y}_{low}$, whose observed value is meandiff.obs in the output above
p-value: the two-sided proportion computed in the last command above

Since our estimated two-sided p-value is less than $\alpha = 0.05$, we have just enough evidence to conclude that the average snowfall in Boston is different in the two groups; in fact, snowfall tends to be lower in years with high temperature. This is certainly not a causal relationship (there is no way to randomly assign temperature to years), and this is not a random sample of years, so this does not necessarily mean the trend generalizes outside the years studied or to other locations.
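The permutation procedure in part (d) can be sketched as follows. This is a minimal illustration and not the course's HW 2 Solutions R Code.R: the data frame is filled with made-up values standing in for BOSsnowfall.csv, so its results say nothing about Boston snowfall, and the names `high`, `shuffled`, and `p.value` are hypothetical. Only `totalsnow`, `avgmaxtemp`, `meandiff.obs`, `meandiff.sim`, and the 10,000 iterations come from the text above.

```r
set.seed(139)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in data: NOT the BOSsnowfall.csv measurements
snow <- data.frame(
  totalsnow  = rgamma(91, shape = 4, scale = 10),  # made-up snowfall, inches
  avgmaxtemp = rnorm(91, mean = 58, sd = 1.5)      # made-up temperatures, degrees F
)

# Group indicator: TRUE for winters above the 3rd quartile of avgmaxtemp
high <- snow$avgmaxtemp > quantile(snow$avgmaxtemp, 0.75)

# Observed test statistic: mean(High group) - mean(Low group)
meandiff.obs <- mean(snow$totalsnow[high]) - mean(snow$totalsnow[!high])

# Reference distribution: shuffle the group labels and recompute the statistic
nsims <- 10000
meandiff.sim <- rep(NA, nsims)
for (i in 1:nsims) {
  shuffled <- sample(high)  # random permutation of the High/Low labels
  meandiff.sim[i] <- mean(snow$totalsnow[shuffled]) - mean(snow$totalsnow[!shuffled])
}

# Two-sided p-value: share of shuffles at least as extreme as the observed difference
p.value <- mean(abs(meandiff.sim) >= abs(meandiff.obs))
print(p.value)
```

Calling `hist(meandiff.sim)` afterwards would draw the reference distribution. Shuffling the labels rather than the snowfall values keeps the group sizes fixed at their observed 3-to-1 split, which is what the null hypothesis of exchangeable groups requires.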
Stat 260 - Lecture 20 Recap of Last Class Last class we introduced the covariance and correlation between two jointly distributed random variables. Today: We will introduce the idea of a statistic and
More informationST 371 (IX): Theories of Sampling Distributions
ST 371 (IX): Theories of Sampling Distributions 1 Sample, Population, Parameter and Statistic The major use of inferential statistics is to use information from a sample to infer characteristics about
More informationEstimating a population mean
Introductory Statistics Lectures Estimating a population mean Confidence intervals for means Department of Mathematics Pima Community College Redistribution of this material is prohibited without written
More informationHarvard University. Rigorous Research in Engineering Education
Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected
More informationMath 101: Elementary Statistics Tests of Hypothesis
Tests of Hypothesis Department of Mathematics and Computer Science University of the Philippines Baguio November 15, 2018 Basic Concepts of Statistical Hypothesis Testing A statistical hypothesis is an
More informationUniversity of California, Berkeley, Statistics 131A: Statistical Inference for the Social and Life Sciences. Michael Lugo, Spring 2012
University of California, Berkeley, Statistics 3A: Statistical Inference for the Social and Life Sciences Michael Lugo, Spring 202 Solutions to Exam Friday, March 2, 202. [5: 2+2+] Consider the stemplot
More informationExample continued. Math 425 Intro to Probability Lecture 37. Example continued. Example
continued : Coin tossing Math 425 Intro to Probability Lecture 37 Kenneth Harris kaharri@umich.edu Department of Mathematics University of Michigan April 8, 2009 Consider a Bernoulli trials process with
More informationLecture 4: Types of errors. Bayesian regression models. Logistic regression
Lecture 4: Types of errors. Bayesian regression models. Logistic regression A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting more generally COMP-652 and ECSE-68, Lecture
More informationInference for the mean of a population. Testing hypotheses about a single mean (the one sample t-test). The sign test for matched pairs
Stat 528 (Autumn 2008) Inference for the mean of a population (One sample t procedures) Reading: Section 7.1. Inference for the mean of a population. The t distribution for a normal population. Small sample
More informationFinal Exam STAT On a Pareto chart, the frequency should be represented on the A) X-axis B) regression C) Y-axis D) none of the above
King Abdul Aziz University Faculty of Sciences Statistics Department Final Exam STAT 0 First Term 49-430 A 40 Name No ID: Section: You have 40 questions in 9 pages. You have 90 minutes to solve the exam.
More informationWhat Is a Sampling Distribution? DISTINGUISH between a parameter and a statistic
Section 8.1A What Is a Sampling Distribution? Learning Objectives After this section, you should be able to DISTINGUISH between a parameter and a statistic DEFINE sampling distribution DISTINGUISH between
More informationOpen book, but no loose leaf notes and no electronic devices. Points (out of 200) are in parentheses. Put all answers on the paper provided to you.
ISQS 5347 Final Exam Spring 2017 Open book, but no loose leaf notes and no electronic devices. Points (out of 200) are in parentheses. Put all answers on the paper provided to you. 1. Recall the commute
More informationAP Final Review II Exploring Data (20% 30%)
AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure
More informationWooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics
Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).
More informationMath 10 - Compilation of Sample Exam Questions + Answers
Math 10 - Compilation of Sample Exam Questions + Sample Exam Question 1 We have a population of size N. Let p be the independent probability of a person in the population developing a disease. Answer the
More informationStatistics and Quantitative Analysis U4320. Segment 5: Sampling and inference Prof. Sharyn O Halloran
Statistics and Quantitative Analysis U4320 Segment 5: Sampling and inference Prof. Sharyn O Halloran Sampling A. Basics 1. Ways to Describe Data Histograms Frequency Tables, etc. 2. Ways to Characterize
More informationIEOR 165 Lecture 7 1 Bias-Variance Tradeoff
IEOR 165 Lecture 7 Bias-Variance Tradeoff 1 Bias-Variance Tradeoff Consider the case of parametric regression with β R, and suppose we would like to analyze the error of the estimate ˆβ in comparison to
More informationNumerical Measures of Central Tendency
ҧ Numerical Measures of Central Tendency The central tendency of the set of measurements that is, the tendency of the data to cluster, or center, about certain numerical values; usually the Mean, Median
More informationLecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)
Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Mean vs.
More informationLecture 30. DATA 8 Summer Regression Inference
DATA 8 Summer 2018 Lecture 30 Regression Inference Slides created by John DeNero (denero@berkeley.edu) and Ani Adhikari (adhikari@berkeley.edu) Contributions by Fahad Kamran (fhdkmrn@berkeley.edu) and
More information