Practical Statistics

Size: px
Start display at page:

Download "Practical Statistics"

Transcription

1 Practical Statistics Lecture 1 (Nov. 9): - Correlation - Hypothesis Testing Lecture 2 (Nov. 16): - Error Estimation - Bayesian Analysis - Rejecting Outliers Lecture 3 (Nov. 18) - Monte Carlo Modeling - Bootstrap + Jack-knife Lecture 4 (Nov. 30): - Detection Effects - Survival Analysis Lecture 5 (Dec. 2): - Fourier Techniques - Filtering - Unevenly Sampled Data Good Reference: Hogg et al

2 Review: Process of Decision Making Ask a Question Take Data Reduce Data Derive Statistics describing data Reflect on what is needed Probability Distribution Error Analysis Does the Statistic answer your question? No Hypothesis Testing Yes Simulation Publish! 2

3 Review: The Binomial distribution You are observing something that has a probability, p, of occurring in a single observation. You observe it N times. Want chance of obtaining n successes. For one, particular sequence of observations the probability is: P 1 (n) =p n (1 p) N n There are many sequences which yield n successes: N! P (n) = n!(n n)! pn (1 p) N n N = p n (1 p) N n n Mean Np Variance Np(1-p) Often said N choose n

4 Review: Mean and Variance of Distributions Distribution Mean Variance Binomial Np Np(1-p) Poisson µ µ Gaussian µ σ 2 Uniform [a,b) (a+b)/2 (b-a)/12

5 Review: Comparing a data set to a distribution Suppose we have N data points and a model we think describes this with M parameters. Our model: y(x) =y(x, a 1...a M ) An intuitive metric is the distance of each data point from the model. Let's use the square of the difference between data and the model. LS = N (yi y(x i,a 1 a M )) 2 Why is this a reasonable metric for Determining the best fit to the data?

6 Review: Chi-squared The statistic chi-squared is defined as: N χ 2 (y i y(x i )) 2 = Chi-squared is not a unique metric, but is commonly used: Mean: µ χ 2 = ν = N M Variance: σχ 2 =2ν 2 Often, reduced chi-squared is quoted: Mean: Variance: i=1 µ reduced χ 2 =1 σ 2 reduced χ 2 = 2 ν σ 2

7 HW 2, Problem 2 10% of G type stars have detectable RV. How many stars should I observe to determine whether M type stars are similar? 7

8 Exam 1: Problem 2 Detector has digital units of measured flux, and 3 DU measured RMS noise at this level. How many photons does this correspond to? At no-light level, we measure 1 DU of RMS noise. How much noise does this add? 8

9 Correlation Often the first approach to analyzing data is to look for correlations in various parameters. - May or may not be physically motivated. - Understand experimental effects first (be skeptical). - Be careful of subclusters of points. - Correlation is not (necessarily) causation (remain skeptical). 9

10 A mass-separation correlation? 10

11 Are people born early in the year better hockey players? See Outliers book by Malcolm Gladwell 11

12 Correlation coefficient The correlation coefficient for two parameters, x and y, is defined as the covariance between parameters over the scatter in the distribution for each parameter: ρ = covariance(x, y) σ x σ y The correlation coefficient can be estimated directly from the data: r = i (X i <X>)(Y i <Y >) i (X i <X>) 2 i (Y i <Y >) 2 12

13 Probability of correlation For a bivariate Gaussian distribution, Bayes theorem can be used to estimate the probability of correlation: prob(ρ data) (1 ρ2 ) (N 1)/2 (1 ρr) ( ρr N 3/2 n 1/ ) 13

14 What if we see a correlation? It s common (but dangerous!) to just fit a line to the data: Anscombe s quartet illustrates the potential pitfalls of line fitting 14

15 Principle Component Analysis If we have N objects, n measured variables (x_n) for each object then: - We want a minimum number of variables that are independent. - These variables will be linear combinations of the observed variables: i = n a ij x j j=1 The goal is to define the new variables to minimize the residual variance in the data 15

16 Geometrical view of PCA Iterative approach of finding the component with maximum variance. 16

17 PCA manipulation 17

18 Statistics for Hypothesis Testing Hypothesis testing uses some metric to determine whether two data sets, or a data set and a model, are distinct. Typically, the problem is set up so that the hypothesis is that the data sets are consistent (the null hypothesis). A probability is calculated that the value found would be obtained again with another sample. Based on the required level of confidence, the hypothesis is rejected or accepted.

19 Parametric Tests Often, the most intuitive way to understand our data is to choose the parameter of interest (say the mean) and compare it to a model. Alternatively, we might be comparing two data sets by asking whether the differences in a statistic are meaningful. These general tests are called Parametric tests They can use frequentist approaches to accept or reject the hypothesis. They can use Bayesian approaches to calculate probabilities of different results. 19

20 Are two data sets drawn from the same distribution? The t statistic quantifies the likelihood that the means are the same. The F statistic quantifies the likelihood that the variances of two data sets are the same. Consider two data sets, x and y, with m and n data points: t = x y s 1/m +1/n F = (xi x) 2 /(n 1) (yi y) 2 /(m 1) s 2 = ns x + ms y n + m S x = (xi x) 2 n

21 Student's t test Calculate the t statistic. A perfect agreement is t=0. Evaluate the probability for t>value. ν = m + n 2 t = x y s 1/m +1/n s 2 = ns x + ms y n + m

22 F test Calculate the F statistic. F = (xi x) 2 /(n 1) (yi y) 2 /(m 1) Calculate the probability that F>value.

23 Non-Parametric Tests If we don t know the underlying distribution, or have small number statistics, there are still tests that can be used to accept or reject a hypothesis. Non-parametric tests still make some assumption about the data: Usually this is something related to the data following counting statistics, or the binomial distribution (randomness assumed, in the appropriate form) 23

24 Chi-squared test The chi-squared statistic can be used to compare any model to a data set: χ 2 = N i=1 (E i O i ) 2 E i Assumes variation in data is due to counting statistics Data must be binned so that E_i is reasonable for the model 24

25 The Kolmogorov-Smirnov Test Calculate the cumulative distribution function for your model (C_model(x)). Calculate the cumulative distribution function for your data(c_data(x). Find maximum of Cmodel(x)-Cdata(x) The variables, x, must be continuous to use K-S test. Don t need to bin the data.

26 K-S test example

27 Assignment: Test your toolbox Download Matlab (or use another tool for this) Download plot data set at: - Familiarize yourself with plotting data, error bars, etc. (This data set will be the basis of HW 7) 27

28 Matlab download go to: - Follow instructions to download and install. - Make sure to use an you@ .arizona.edu for Mathworks registration. 28

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy Probability (Lecture 1) Statistics (Lecture 2) Why do we need statistics? Useful Statistics Definitions Error Analysis Probability distributions Error Propagation Binomial

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy If your experiment needs statistics, you ought to have done a better experiment. -Ernest Rutherford Lecture 1 Lecture 2 Why do we need statistics? Definitions Statistical

More information

AST 418/518 Instrumentation and Statistics

AST 418/518 Instrumentation and Statistics AST 418/518 Instrumentation and Statistics Class Website: http://ircamera.as.arizona.edu/astr_518 Class Texts: Practical Statistics for Astronomers, J.V. Wall, and C.R. Jenkins Measuring the Universe,

More information

Statistics notes. A clear statistical framework formulates the logic of what we are doing and why. It allows us to make precise statements.

Statistics notes. A clear statistical framework formulates the logic of what we are doing and why. It allows us to make precise statements. Statistics notes Introductory comments These notes provide a summary or cheat sheet covering some basic statistical recipes and methods. These will be discussed in more detail in the lectures! What is

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy Probability (Lecture 1) Statistics (Lecture 2) Why do we need statistics? Useful Statistics Definitions Error Analysis Probability distributions Error Propagation Binomial

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Math Review Sheet, Fall 2008

Math Review Sheet, Fall 2008 1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the

More information

STATISTICS OF OBSERVATIONS & SAMPLING THEORY. Parent Distributions

STATISTICS OF OBSERVATIONS & SAMPLING THEORY. Parent Distributions ASTR 511/O Connell Lec 6 1 STATISTICS OF OBSERVATIONS & SAMPLING THEORY References: Bevington Data Reduction & Error Analysis for the Physical Sciences LLM: Appendix B Warning: the introductory literature

More information

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course.

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course. Name of the course Statistical methods and data analysis Audience The course is intended for students of the first or second year of the Graduate School in Materials Engineering. The aim of the course

More information

Introduction to Statistical Methods for High Energy Physics

Introduction to Statistical Methods for High Energy Physics Introduction to Statistical Methods for High Energy Physics 2011 CERN Summer Student Lectures Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

Data modelling Parameter estimation

Data modelling Parameter estimation ASTR509-10 Data modelling Parameter estimation Pierre-Simon, Marquis de Laplace 1749-1827 Under Napoleon, Laplace was a member, then chancellor, of the Senate, and received the Legion of Honour in 1805.

More information

Statistics and Data Analysis

Statistics and Data Analysis Statistics and Data Analysis The Crash Course Physics 226, Fall 2013 "There are three kinds of lies: lies, damned lies, and statistics. Mark Twain, allegedly after Benjamin Disraeli Statistics and Data

More information

Unit 10: Simple Linear Regression and Correlation

Unit 10: Simple Linear Regression and Correlation Unit 10: Simple Linear Regression and Correlation Statistics 571: Statistical Methods Ramón V. León 6/28/2004 Unit 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regression analysis is a method for

More information

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming

More information

My data doesn t look like that..

My data doesn t look like that.. Testing assumptions My data doesn t look like that.. We have made a big deal about testing model assumptions each week. Bill Pine Testing assumptions Testing assumptions We have made a big deal about testing

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

Physics 6720 Introduction to Statistics April 4, 2017

Physics 6720 Introduction to Statistics April 4, 2017 Physics 6720 Introduction to Statistics April 4, 2017 1 Statistics of Counting Often an experiment yields a result that can be classified according to a set of discrete events, giving rise to an integer

More information

BERTINORO 2 (JVW) Yet more probability Bayes' Theorem* Monte Carlo! *The Reverend Thomas Bayes

BERTINORO 2 (JVW) Yet more probability Bayes' Theorem* Monte Carlo! *The Reverend Thomas Bayes BERTINORO 2 (JVW) Yet more probability Bayes' Theorem* Monte Carlo! *The Reverend Thomas Bayes 1702-61 1 The Power-law (Scale-free) Distribution N(>L) = K L (γ+1) (integral form) dn = (γ+1) K L γ dl (differential

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 3 October 29, 2012 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline Reminder: Probability density function Cumulative

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

Institute of Actuaries of India

Institute of Actuaries of India Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2018 Examinations Subject CT3 Probability and Mathematical Statistics Core Technical Syllabus 1 June 2017 Aim The

More information

Deciding, Estimating, Computing, Checking

Deciding, Estimating, Computing, Checking Deciding, Estimating, Computing, Checking How are Bayesian posteriors used, computed and validated? Fundamentalist Bayes: The posterior is ALL knowledge you have about the state Use in decision making:

More information

Deciding, Estimating, Computing, Checking. How are Bayesian posteriors used, computed and validated?

Deciding, Estimating, Computing, Checking. How are Bayesian posteriors used, computed and validated? Deciding, Estimating, Computing, Checking How are Bayesian posteriors used, computed and validated? Fundamentalist Bayes: The posterior is ALL knowledge you have about the state Use in decision making:

More information

Physics 509: Bootstrap and Robust Parameter Estimation

Physics 509: Bootstrap and Robust Parameter Estimation Physics 509: Bootstrap and Robust Parameter Estimation Scott Oser Lecture #20 Physics 509 1 Nonparametric parameter estimation Question: what error estimate should you assign to the slope and intercept

More information

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013 Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis

More information

Introduction to Statistics and Error Analysis II

Introduction to Statistics and Error Analysis II Introduction to Statistics and Error Analysis II Physics116C, 4/14/06 D. Pellett References: Data Reduction and Error Analysis for the Physical Sciences by Bevington and Robinson Particle Data Group notes

More information

REVIEW 8/2/2017 陈芳华东师大英语系

REVIEW 8/2/2017 陈芳华东师大英语系 REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p

More information

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X.04) =.8508. For z < 0 subtract the value from,

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. It is Time for Homework! ( ω `) First homework + data will be posted on the website, under the homework tab. And

More information

Fourier and Stats / Astro Stats and Measurement : Stats Notes

Fourier and Stats / Astro Stats and Measurement : Stats Notes Fourier and Stats / Astro Stats and Measurement : Stats Notes Andy Lawrence, University of Edinburgh Autumn 2013 1 Probabilities, distributions, and errors Laplace once said Probability theory is nothing

More information

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 Contents Preface to Second Edition Preface to First Edition Abbreviations xv xvii xix PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 1 The Role of Statistical Methods in Modern Industry and Services

More information

CE 3710: Uncertainty Analysis in Engineering

CE 3710: Uncertainty Analysis in Engineering FINAL EXAM Monday, December 14, 10:15 am 12:15 pm, Chem Sci 101 Open book and open notes. Exam will be cumulative, but emphasis will be on material covered since Exam II Learning Expectations for Final

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Ramesh Yapalparvi It is Time for Homework! ( ω `) First homework + data will be posted on the website, under the homework tab. And also sent out via email. 30% weekly homework.

More information

Mock Exam - 2 hours - use of basic (non-programmable) calculator is allowed - all exercises carry the same marks - exam is strictly individual

Mock Exam - 2 hours - use of basic (non-programmable) calculator is allowed - all exercises carry the same marks - exam is strictly individual Mock Exam - 2 hours - use of basic (non-programmable) calculator is allowed - all exercises carry the same marks - exam is strictly individual Question 1. Suppose you want to estimate the percentage of

More information

Some Statistics. V. Lindberg. May 16, 2007

Some Statistics. V. Lindberg. May 16, 2007 Some Statistics V. Lindberg May 16, 2007 1 Go here for full details An excellent reference written by physicists with sample programs available is Data Reduction and Error Analysis for the Physical Sciences,

More information

Hypothesis testing:power, test statistic CMS:

Hypothesis testing:power, test statistic CMS: Hypothesis testing:power, test statistic The more sensitive the test, the better it can discriminate between the null and the alternative hypothesis, quantitatively, maximal power In order to achieve this

More information

Ch. 1: Data and Distributions

Ch. 1: Data and Distributions Ch. 1: Data and Distributions Populations vs. Samples How to graphically display data Histograms, dot plots, stem plots, etc Helps to show how samples are distributed Distributions of both continuous and

More information

Fundamental Probability and Statistics

Fundamental Probability and Statistics Fundamental Probability and Statistics "There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Statistical Methods in Particle Physics Lecture 1: Bayesian methods Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Lecture 1: Probability Fundamentals

Lecture 1: Probability Fundamentals Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

STA 2201/442 Assignment 2

STA 2201/442 Assignment 2 STA 2201/442 Assignment 2 1. This is about how to simulate from a continuous univariate distribution. Let the random variable X have a continuous distribution with density f X (x) and cumulative distribution

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty

More information

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Questions?! C. Porciani! Estimation & forecasting! 2! Cosmological parameters! A branch of modern cosmological research focuses

More information

Lecture 10: Generalized likelihood ratio test

Lecture 10: Generalized likelihood ratio test Stat 200: Introduction to Statistical Inference Autumn 2018/19 Lecture 10: Generalized likelihood ratio test Lecturer: Art B. Owen October 25 Disclaimer: These notes have not been subjected to the usual

More information

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO

LECTURE NOTES FYS 4550/FYS EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO LECTURE NOTES FYS 4550/FYS9550 - EXPERIMENTAL HIGH ENERGY PHYSICS AUTUMN 2013 PART I PROBABILITY AND STATISTICS A. STRANDLIE GJØVIK UNIVERSITY COLLEGE AND UNIVERSITY OF OSLO Before embarking on the concept

More information

CSE 312 Final Review: Section AA

CSE 312 Final Review: Section AA CSE 312 TAs December 8, 2011 General Information General Information Comprehensive Midterm General Information Comprehensive Midterm Heavily weighted toward material after the midterm Pre-Midterm Material

More information

Lecture 2. G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1

Lecture 2. G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1 Lecture 2 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

1 Statistics Aneta Siemiginowska a chapter for X-ray Astronomy Handbook October 2008

1 Statistics Aneta Siemiginowska a chapter for X-ray Astronomy Handbook October 2008 1 Statistics Aneta Siemiginowska a chapter for X-ray Astronomy Handbook October 2008 1.1 Introduction Why do we need statistic? Wall and Jenkins (2003) give a good description of the scientific analysis

More information

18.05 Practice Final Exam

18.05 Practice Final Exam No calculators. 18.05 Practice Final Exam Number of problems 16 concept questions, 16 problems. Simplifying expressions Unless asked to explicitly, you don t need to simplify complicated expressions. For

More information

Statistics Introductory Correlation

Statistics Introductory Correlation Statistics Introductory Correlation Session 10 oscardavid.barrerarodriguez@sciencespo.fr April 9, 2018 Outline 1 Statistics are not used only to describe central tendency and variability for a single variable.

More information

Overview of Spatial analysis in ecology

Overview of Spatial analysis in ecology Spatial Point Patterns & Complete Spatial Randomness - II Geog 0C Introduction to Spatial Data Analysis Chris Funk Lecture 8 Overview of Spatial analysis in ecology st step in understanding ecological

More information

Part III: Unstructured Data

Part III: Unstructured Data Inf1-DA 2010 2011 III: 51 / 89 Part III Unstructured Data Data Retrieval: III.1 Unstructured data and data retrieval Statistical Analysis of Data: III.2 Data scales and summary statistics III.3 Hypothesis

More information

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit Statistics Lent Term 2015 Prof. Mark Thomson Lecture 2 : The Gaussian Limit Prof. M.A. Thomson Lent Term 2015 29 Lecture Lecture Lecture Lecture 1: Back to basics Introduction, Probability distribution

More information

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis Lecture 3 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Primer on statistics:

Primer on statistics: Primer on statistics: MLE, Confidence Intervals, and Hypothesis Testing ryan.reece@gmail.com http://rreece.github.io/ Insight Data Science - AI Fellows Workshop Feb 16, 018 Outline 1. Maximum likelihood

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Robustness and Distribution Assumptions

Robustness and Distribution Assumptions Chapter 1 Robustness and Distribution Assumptions 1.1 Introduction In statistics, one often works with model assumptions, i.e., one assumes that data follow a certain model. Then one makes use of methodology

More information

Simulation. Where real stuff starts

Simulation. Where real stuff starts 1 Simulation Where real stuff starts ToC 1. What is a simulation? 2. Accuracy of output 3. Random Number Generators 4. How to sample 5. Monte Carlo 6. Bootstrap 2 1. What is a simulation? 3 What is a simulation?

More information

STAT 461/561- Assignments, Year 2015

STAT 461/561- Assignments, Year 2015 STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Statistical techniques for data analysis in Cosmology

Statistical techniques for data analysis in Cosmology Statistical techniques for data analysis in Cosmology arxiv:0712.3028; arxiv:0911.3105 Numerical recipes (the bible ) Licia Verde ICREA & ICC UB-IEEC http://icc.ub.edu/~liciaverde outline Lecture 1: Introduction

More information

Statistical Models with Uncertain Error Parameters (G. Cowan, arxiv: )

Statistical Models with Uncertain Error Parameters (G. Cowan, arxiv: ) Statistical Models with Uncertain Error Parameters (G. Cowan, arxiv:1809.05778) Workshop on Advanced Statistics for Physics Discovery aspd.stat.unipd.it Department of Statistical Sciences, University of

More information

Stat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet.

Stat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet. Stat 535 C - Statistical Computing & Monte Carlo Methods Arnaud Doucet Email: arnaud@cs.ubc.ca 1 CS students: don t forget to re-register in CS-535D. Even if you just audit this course, please do register.

More information

Hypothesis testing. 1 Principle of hypothesis testing 2

Hypothesis testing. 1 Principle of hypothesis testing 2 Hypothesis testing Contents 1 Principle of hypothesis testing One sample tests 3.1 Tests on Mean of a Normal distribution..................... 3. Tests on Variance of a Normal distribution....................

More information

Math 562 Homework 1 August 29, 2006 Dr. Ron Sahoo

Math 562 Homework 1 August 29, 2006 Dr. Ron Sahoo Math 56 Homework August 9, 006 Dr. Ron Sahoo He who labors diligently need never despair; for all things are accomplished by diligence and labor. Menander of Athens Direction: This homework worths 60 points

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

y n 1 ( x i x )( y y i n 1 i y 2

y n 1 ( x i x )( y y i n 1 i y 2 STP3 Brief Class Notes Instructor: Ela Jackiewicz Chapter Regression and Correlation In this chapter we will explore the relationship between two quantitative variables, X an Y. We will consider n ordered

More information

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1 Lecture 5 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

M & M Project. Think! Crunch those numbers! Answer!

M & M Project. Think! Crunch those numbers! Answer! M & M Project Think! Crunch those numbers! Answer! Chapters 1-2 Exploring Data and Describing Location in a Distribution Univariate Data: Length Stemplot and Frequency Table Stem (Units Digit) 0 1 1 Leaf

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

Probability Density Functions

Probability Density Functions Statistical Methods in Particle Physics / WS 13 Lecture II Probability Density Functions Niklaus Berger Physics Institute, University of Heidelberg Recap of Lecture I: Kolmogorov Axioms Ingredients: Set

More information

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006 Astronomical p( y x, I) p( x, I) p ( x y, I) = p( y, I) Data Analysis I Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK 10 lectures, beginning October 2006 4. Monte Carlo Methods

More information

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October

More information

Biostatistics for physicists fall Correlation Linear regression Analysis of variance

Biostatistics for physicists fall Correlation Linear regression Analysis of variance Biostatistics for physicists fall 2015 Correlation Linear regression Analysis of variance Correlation Example: Antibody level on 38 newborns and their mothers There is a positive correlation in antibody

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

More information

STAT 518 Intro Student Presentation

STAT 518 Intro Student Presentation STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible

More information

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapte The McGraw-Hill Companies, Inc. All rights reserved. er15 Chapte Chi-Square Tests d Chi-Square Tests for -Fit Uniform Goodness- Poisson Goodness- Goodness- ECDF Tests (Optional) Contingency Tables A contingency table is a cross-tabulation of n paired observations

More information

Math 50: Final. 1. [13 points] It was found that 35 out of 300 famous people have the star sign Sagittarius.

Math 50: Final. 1. [13 points] It was found that 35 out of 300 famous people have the star sign Sagittarius. Math 50: Final 180 minutes, 140 points. No algebra-capable calculators. Try to use your calculator only at the end of your calculation, and show working/reasoning. Please do look up z, t, χ 2 values for

More information

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018 Statistics Boot Camp Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018 March 21, 2018 Outline of boot camp Summarizing and simplifying data Point and interval estimation Foundations of statistical

More information

Modeling and Performance Analysis with Discrete-Event Simulation

Modeling and Performance Analysis with Discrete-Event Simulation Simulation Modeling and Performance Analysis with Discrete-Event Simulation Chapter 9 Input Modeling Contents Data Collection Identifying the Distribution with Data Parameter Estimation Goodness-of-Fit

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

APPENDICES APPENDIX A. STATISTICAL TABLES AND CHARTS 651 APPENDIX B. BIBLIOGRAPHY 677 APPENDIX C. ANSWERS TO SELECTED EXERCISES 679

APPENDICES APPENDIX A. STATISTICAL TABLES AND CHARTS 651 APPENDIX B. BIBLIOGRAPHY 677 APPENDIX C. ANSWERS TO SELECTED EXERCISES 679 APPENDICES APPENDIX A. STATISTICAL TABLES AND CHARTS 1 Table I Summary of Common Probability Distributions 2 Table II Cumulative Standard Normal Distribution Table III Percentage Points, 2 of the Chi-Squared

More information

STATISTICS ANCILLARY SYLLABUS. (W.E.F. the session ) Semester Paper Code Marks Credits Topic

STATISTICS ANCILLARY SYLLABUS. (W.E.F. the session ) Semester Paper Code Marks Credits Topic STATISTICS ANCILLARY SYLLABUS (W.E.F. the session 2014-15) Semester Paper Code Marks Credits Topic 1 ST21012T 70 4 Descriptive Statistics 1 & Probability Theory 1 ST21012P 30 1 Practical- Using Minitab

More information

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg Statistics for Data Analysis PSI Practical Course 2014 Niklaus Berger Physics Institute, University of Heidelberg Overview You are going to perform a data analysis: Compare measured distributions to theoretical

More information

Probability and Statistics. Joyeeta Dutta-Moscato June 29, 2015

Probability and Statistics. Joyeeta Dutta-Moscato June 29, 2015 Probability and Statistics Joyeeta Dutta-Moscato June 29, 2015 Terms and concepts Sample vs population Central tendency: Mean, median, mode Variance, standard deviation Normal distribution Cumulative distribution

More information

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Instructions This exam has 7 pages in total, numbered 1 to 7. Make sure your exam has all the pages. This exam will be 2 hours

More information

Overview. Confidence Intervals Sampling and Opinion Polls Error Correcting Codes Number of Pet Unicorns in Ireland

Overview. Confidence Intervals Sampling and Opinion Polls Error Correcting Codes Number of Pet Unicorns in Ireland Overview Confidence Intervals Sampling and Opinion Polls Error Correcting Codes Number of Pet Unicorns in Ireland Confidence Intervals When a random variable lies in an interval a X b with a specified

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future

More information

AST 418/518 Instrumentation and Statistics

AST 418/518 Instrumentation and Statistics AST 418/518 Instrumentation and Statistics Cass Website: http://ircamera.as.arizona.edu/astr_518 Cass Texts: Practica Statistics for Astronomers, J.V. Wa, and C.R. Jenkins, Second Edition. Measuring the

More information

Physics 509: Non-Parametric Statistics and Correlation Testing

Physics 509: Non-Parametric Statistics and Correlation Testing Physics 509: Non-Parametric Statistics and Correlation Testing Scott Oser Lecture #19 Physics 509 1 What is non-parametric statistics? Non-parametric statistics is the application of statistical tests

More information