Probability and Statistics. What is probability? What is statistics?


Probability and Statistics

Probability:
Formally defined using a set of axioms.
Seeks to determine the likelihood that a given event, observation, or measurement will happen or has happened.
Example: what is the probability of throwing a 7 using two dice?

Statistics:
Used to analyze the frequency of past events.
Uses a given sample of data to assess a probabilistic model's validity or to determine the values of its parameters.
Example: after observing several throws of two dice, can I determine whether or not they are loaded?
Also depends on what we mean by probability.
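The two-dice question can be answered by counting equally likely outcomes. The following is a minimal sketch, assuming two fair six-sided dice; the function name `prob_two_dice_sum` is illustrative and not part of the slides.

```python
from fractions import Fraction
from itertools import product


def prob_two_dice_sum(target: int) -> Fraction:
    """Probability that two fair six-sided dice sum to `target` (assumes fair dice)."""
    outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely (a, b) pairs
    favourable = sum(1 for a, b in outcomes if a + b == target)
    return Fraction(favourable, len(outcomes))


print(prob_two_dice_sum(7))  # 6 of the 36 pairs sum to 7
```

The statistics question (are the dice loaded?) runs the other way: one would compare observed frequencies against this fair-dice model.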

Probability and Statistics

We perform an experiment to collect a number of top quarks.
How do we extract the best value for its mass?
What is the uncertainty of our best value?
Is our experiment internally consistent?
Is this value consistent with a given theory, which itself may contain uncertainties?
Is this value consistent with other measurements of the top quark mass?


Probability and Statistics

Pentaquark search: how can this occur?
2003: 6.8σ effect.
2005: no effect.

Probability

Let the sample space $S$ be the space of all possible outcomes of an experiment, and let $x$ be a possible outcome. Then

$$P(x \text{ found in } [x, x + dx]) = f(x)\,dx .$$

$f(x)$ is called the probability density function (pdf). It may be written $f(x; \theta)$, since the pdf could depend on one or more parameters $\theta$. Often we will want to determine $\theta$ from a set of measurements. Of course $x$ must be somewhere, so

$$\int_{-\infty}^{\infty} f(x)\,dx = 1 .$$

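Probability

Definitions of mean and variance are given in terms of expectation values:

$$E[x] = \int_{-\infty}^{\infty} x\, f(x)\,dx \equiv \mu$$

$$V[x] = E\left[(x - \mu)^2\right] = E[x^2] - (E[x])^2 \equiv \sigma^2$$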

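Probability

Definitions of covariance and correlation coefficient:

$$V_{xy} = \mathrm{cov}[x, y] = E[(x - \mu_x)(y - \mu_y)] = E[xy] - \mu_x \mu_y$$

$$\rho_{xy} = \frac{\mathrm{cov}[x, y]}{\sigma_x \sigma_y}$$

If $x$ and $y$ are independent, then $f(x, y) = f_x(x)\, f_y(y)$ and

$$E[xy] = \iint x\, y\, f(x, y)\,dx\,dy = \mu_x \mu_y ,$$

and so $\mathrm{cov}[x, y] = 0$.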

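Probability: Error propagation

Consider $y(\vec{x})$ with variables $\vec{x} = (x_1, \ldots, x_n)$, and with $\mu_i = E[x_i]$ and $V_{ij} = \mathrm{cov}[x_i, x_j]$. Taylor series (TS) expanding $y(\vec{x})$ about $\vec{\mu}$ to first order,

$$y(\vec{x}) \approx y(\vec{\mu}) + \sum_{i=1}^{n} \left[\frac{\partial y}{\partial x_i}\right]_{\vec{x} = \vec{\mu}} (x_i - \mu_i),$$

we find

$$E[y(\vec{x})] \approx y(\vec{\mu}), \qquad \sigma_y^2 = E[y^2] - (E[y])^2 \approx \sum_{i,j=1}^{n} \left[\frac{\partial y}{\partial x_i} \frac{\partial y}{\partial x_j}\right]_{\vec{x} = \vec{\mu}} V_{ij} .$$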

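Probability

This gives the familiar error propagation formulas for sums (or differences) and products (or ratios). Using

$$\sigma_y^2 \approx \sum_{i,j} \left[\frac{\partial y}{\partial x_i} \frac{\partial y}{\partial x_j}\right]_{\vec{x} = \vec{\mu}} V_{ij},$$

we find for $y = x_1 + x_2$

$$\sigma_y^2 = \sigma_1^2 + \sigma_2^2 + 2\,\mathrm{cov}[x_1, x_2],$$

and for $y = x_1 x_2$

$$\frac{\sigma_y^2}{y^2} = \frac{\sigma_1^2}{x_1^2} + \frac{\sigma_2^2}{x_2^2} + \frac{2\,\mathrm{cov}[x_1, x_2]}{x_1 x_2} .$$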
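The sum and product rules for first-order error propagation can be sketched numerically as below. The function names and the example values are illustrative assumptions, not taken from the slides; the formulas are the standard ones for $y = x_1 + x_2$ and $y = x_1 x_2$.

```python
import math


def sigma_sum(s1: float, s2: float, cov: float = 0.0) -> float:
    """Uncertainty on y = x1 + x2: sqrt(s1^2 + s2^2 + 2 cov)."""
    return math.sqrt(s1**2 + s2**2 + 2.0 * cov)


def rel_sigma_product(x1: float, s1: float, x2: float, s2: float, cov: float = 0.0) -> float:
    """Relative uncertainty sigma_y / |y| on y = x1 * x2 (first-order approximation)."""
    return math.sqrt((s1 / x1) ** 2 + (s2 / x2) ** 2 + 2.0 * cov / (x1 * x2))


# Hypothetical uncorrelated inputs: absolute errors add in quadrature for a sum,
# relative errors add in quadrature for a product.
print(sigma_sum(3.0, 4.0))
print(rel_sigma_product(10.0, 0.3, 20.0, 0.8))
```

For a difference $y = x_1 - x_2$ the covariance term changes sign, so positively correlated inputs partly cancel.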

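Uniform Distribution

Let

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta - \alpha} & \text{for } \alpha \le x \le \beta \\ 0 & \text{otherwise} \end{cases}$$

Then

$$E[x] = \int_{\alpha}^{\beta} \frac{x}{\beta - \alpha}\,dx = \frac{\alpha + \beta}{2}$$

$$V[x] = \int_{\alpha}^{\beta} \left(x - \frac{\alpha + \beta}{2}\right)^2 \frac{1}{\beta - \alpha}\,dx = \frac{(\beta - \alpha)^2}{12}$$

What is the position resolution of a silicon or multiwire proportional chamber with detection elements of spacing $x$?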
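The uniform-distribution variance answers the position-resolution question directly: a hit anywhere inside one detection element is uniformly distributed over the element spacing, so the resolution is the spacing divided by $\sqrt{12}$. A minimal sketch follows, in which the 50 µm spacing is a hypothetical value chosen only for illustration.

```python
import math


def uniform_mean_var(alpha: float, beta: float) -> tuple[float, float]:
    """Mean and variance of a uniform distribution on [alpha, beta]."""
    return (alpha + beta) / 2.0, (beta - alpha) ** 2 / 12.0


def position_resolution(spacing: float) -> float:
    """Standard deviation of a hit position that is uniform across one element."""
    return spacing / math.sqrt(12.0)


spacing_um = 50.0  # hypothetical element spacing in micrometres (assumption)
print(f"resolution = {position_resolution(spacing_um):.1f} um")
```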

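Binomial Distribution

Consider $N$ independent experiments (Bernoulli trials).
Let the outcome of each be pass or fail.
Let the probability of pass be $p$.

The probability of $n$ successes (and $N - n$ failures) in a specific order is $p^n (1 - p)^{N - n}$.
But there are

$$\frac{N!}{n!\,(N - n)!}$$

permutations for indistinguishable objects, grouping them $n$ at a time, so

$$f(n; N, p) = \frac{N!}{n!\,(N - n)!}\, p^n (1 - p)^{N - n} .$$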

Permutations: Quick review

The number of permutations for $N$ elements is $N!$. This considers each element distinguishable.
But the $n$ elements of the first type are indistinguishable, so $n!$ of the permutations lead to the same situation.
Ditto for the remaining $(N - n)$ elements of the second type.
Thus, accounting for these irrelevant permutations leads to

$$\text{number of unique permutations} = \frac{N!}{n!\,(N - n)!} .$$
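The counting argument can be checked by brute force for small cases. This sketch enumerates every ordering of `n_pass` passes and `n_fail` fails, discards duplicates, and compares the count with the factorial formula; the function names are illustrative.

```python
from itertools import permutations
from math import factorial


def unique_arrangements(n_pass: int, n_fail: int) -> int:
    """Count distinct orderings of n_pass 'P' and n_fail 'F' labels by enumeration."""
    labels = "P" * n_pass + "F" * n_fail
    return len(set(permutations(labels)))  # the set removes indistinguishable orderings


def counting_formula(n_pass: int, n_fail: int) -> int:
    """N! / (n! (N - n)!) with N = n_pass + n_fail and n = n_pass."""
    total = n_pass + n_fail
    return factorial(total) // (factorial(n_pass) * factorial(n_fail))


print(unique_arrangements(2, 2), counting_formula(2, 2))
```

The enumeration grows as $N!$, so it is only practical for a handful of elements; the formula is what one uses in practice.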

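Binomial Distribution

For the mean and variance we obtain (using small tricks)

$$E[n] = \sum_{n=0}^{N} n\, f(n; N, p) = Np$$

$$V[n] = E[n^2] - (E[n])^2 = Np(1 - p)$$

And note with the binomial theorem that

$$\sum_{n=0}^{N} f(n; N, p) = \sum_{n=0}^{N} \frac{N!}{n!\,(N - n)!}\, p^n (1 - p)^{N - n} = \left(p + (1 - p)\right)^N = 1 .$$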
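The normalisation, mean, and variance of the binomial pdf can be verified by direct summation. A minimal sketch, with illustrative function names and a dice-throw example ($N = 10$, $p = 1/6$) chosen only for demonstration:

```python
from math import comb


def binomial_pmf(n: int, N: int, p: float) -> float:
    """f(n; N, p) = C(N, n) p^n (1 - p)^(N - n)."""
    return comb(N, n) * p**n * (1.0 - p) ** (N - n)


def binomial_moments(N: int, p: float) -> tuple[float, float, float]:
    """Return (sum of probabilities, mean, variance) computed by summing over n."""
    probs = [binomial_pmf(n, N, p) for n in range(N + 1)]
    total = sum(probs)
    mean = sum(n * f for n, f in enumerate(probs))
    second_moment = sum(n * n * f for n, f in enumerate(probs))
    return total, mean, second_moment - mean**2


total, mean, var = binomial_moments(10, 1.0 / 6.0)
print(total, mean, var)  # compare with 1, N p and N p (1 - p)
```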

Binomial Distribution

[Figure: binomial pdf]

Examples: Binomial Distribution

Coin flip ($p = 1/2$)
Dice throw ($p = 1/6$)
Branching ratio of nuclear and particle decays ($p = \mathrm{Br}$)
Detector or trigger efficiencies (pass or not pass)
Blood group B or not blood group B

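Binomial Distribution

It is baseball season! What is the probability of a 0.300 hitter getting 4 hits in one game (taking 4 at-bats)?

$$f(4; 4, 0.3) = \frac{4!}{4!\,0!}\, 0.3^4\, 0.7^0 = 0.0081$$

$$E[n] = 4 \times 0.3 = 1.2$$

$$V[n] = 4 \times 0.3 \times 0.7 = 0.84, \qquad \sigma = \sqrt{0.84} \approx 0.92$$

Expect $1.2 \pm 0.92$ hits for a 0.300 hitter. Note that the uncertainty is the standard deviation $\sqrt{V[n]}$, not the variance 0.84 itself.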
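The baseball numbers can be reproduced with a few lines. This is a sketch assuming exactly 4 at-bats per game and independent at-bats with a fixed hit probability of 0.300, which is the model implied by $f(4; 4, 0.3)$; the variable names are illustrative.

```python
from math import comb, sqrt

AT_BATS = 4    # assumption: four at-bats in the game
P_HIT = 0.300  # batting average treated as a per-at-bat hit probability


def hit_probabilities(at_bats: int, p_hit: float) -> list[float]:
    """Binomial probability of 0, 1, ..., at_bats hits."""
    return [comb(at_bats, n) * p_hit**n * (1.0 - p_hit) ** (at_bats - n)
            for n in range(at_bats + 1)]


probs = hit_probabilities(AT_BATS, P_HIT)
mean_hits = AT_BATS * P_HIT
sigma_hits = sqrt(AT_BATS * P_HIT * (1.0 - P_HIT))  # standard deviation, sqrt(V[n])

for n, f in enumerate(probs):
    print(f"P({n} hits) = {f:.4f}")
print(f"expected hits = {mean_hits:.2f} +/- {sigma_hits:.2f}")
```

Quoting the spread as $\sqrt{V[n]}$ keeps it in the same units (hits) as the mean.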

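Poisson Distribution

Consider the binomial distribution when $N \to \infty$ and $p \to 0$, with $E[n] = Np = \nu$ held fixed. The Poisson pdf is

$$f(n; \nu) = \frac{\nu^n}{n!}\, e^{-\nu},$$

and one finds

$$E[n] = \nu, \qquad V[n] = \sigma^2 = \nu .$$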

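Poisson Distribution

Starting from the binomial pdf

$$f(n; N, p) = \frac{N!}{n!\,(N - n)!}\, p^n (1 - p)^{N - n},$$

we have

$$\frac{N!}{(N - n)!} = N (N - 1) \cdots (N - n + 1) \approx N^n \quad \text{for large } N,$$

$$(1 - p)^{N - n} \approx e^{-pN} = e^{-\nu} \quad \text{for small } p \text{ and large } N,$$

so that

$$f(n; \nu) = \frac{(Np)^n}{n!}\, e^{-\nu} = \frac{\nu^n}{n!}\, e^{-\nu} .$$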
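The limit can be checked numerically by holding $\nu = Np$ fixed and letting $N$ grow. The sketch below compares the binomial and Poisson probabilities for small $n$; the function names and the choice $\nu = 2$ are illustrative assumptions.

```python
from math import comb, exp, factorial


def binomial_pmf(n: int, N: int, p: float) -> float:
    """Binomial probability of n successes in N trials."""
    return comb(N, n) * p**n * (1.0 - p) ** (N - n)


def poisson_pmf(n: int, nu: float) -> float:
    """Poisson probability of n events for mean nu."""
    return nu**n * exp(-nu) / factorial(n)


def max_abs_difference(N: int, nu: float, n_max: int = 10) -> float:
    """Largest |binomial - Poisson| over n = 0..n_max, with p = nu / N."""
    p = nu / N
    return max(abs(binomial_pmf(n, N, p) - poisson_pmf(n, nu)) for n in range(n_max + 1))


for N in (10, 100, 1000, 10000):
    print(N, max_abs_difference(N, nu=2.0))  # the difference shrinks as N grows
```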

Poisson Distribution

[Figure: Poisson pdf]

Examples: Poisson Distribution

Particles detected from radioactive decays (the sum of two Poisson processes is a Poisson process)
Particles detected from scattering of a beam on a target with cross section σ
Cosmic rays observed in a time interval t
Number of entries in a histogram bin when data is accumulated over a fixed time interval
Number of Prussian soldiers kicked to death by horses
Infant mortality
QC/failure rate predictions

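Poisson Distribution

Let a sample of $N$ atoms with lifetime $\tau$ give a mean decay rate $\nu \sim N/\tau \sim 2/\mathrm{s}$. Then

$$P(0; 2) = \frac{2^0 e^{-2}}{0!} = 0.135$$

$$P(1; 2) = \frac{2^1 e^{-2}}{1!} = 0.27$$

$$P(2; 2) = \frac{2^2 e^{-2}}{2!} = 0.27$$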
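The quoted probabilities are consistent with a Poisson mean of $\nu = 2$ counts in the observation interval. The sketch below evaluates the Poisson pdf under that assumption; the value $\nu = 2$ is inferred from the numbers on the slide rather than stated explicitly there.

```python
from math import exp, factorial

NU = 2.0  # assumed mean number of decays per interval (inferred, not stated in the source)


def poisson_pmf(n: int, nu: float) -> float:
    """Poisson probability of observing n events for mean nu."""
    return nu**n * exp(-nu) / factorial(n)


for n in range(3):
    print(f"P({n}; {NU:g}) = {poisson_pmf(n, NU):.3f}")
```

Observing 1 or 2 decays is equally likely for $\nu = 2$, because $\nu^1/1! = \nu^2/2!$ in that case.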

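Gaussian Distribution

The Gaussian distribution is important because of the central limit theorem: for $n$ independent variables $x_1, \ldots, x_n$ that are distributed according to any pdf, the sum $y = \sum_i x_i$ will have a pdf that approaches a Gaussian for large $n$, with

$$E[y] = \sum_{i=1}^{n} \mu_i, \qquad V[y] = \sum_{i=1}^{n} \sigma_i^2 .$$

Examples are almost any measurement error (energy resolution, position resolution, ...).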
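The central limit theorem can be illustrated with a small simulation. This sketch sums 12 uniform random numbers on $[0, 1)$, so the sum has mean 6 and variance 1, and then checks how often it falls within one standard deviation of the mean. The choice of 12 terms, the sample size, and the seed are all arbitrary assumptions made for the demonstration.

```python
import random
import statistics


def sample_sums(n_terms: int, n_samples: int, seed: int = 0) -> list[float]:
    """Draw n_samples sums of n_terms independent uniform [0, 1) variables."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.random() for _ in range(n_terms)) for _ in range(n_samples)]


def fraction_within(values: list[float], mu: float, half_width: float) -> float:
    """Fraction of values lying in [mu - half_width, mu + half_width]."""
    return sum(1 for v in values if abs(v - mu) <= half_width) / len(values)


sums = sample_sums(n_terms=12, n_samples=20000)
mu = 12 * 0.5                      # sum of the individual means
sigma = (12 * (1.0 / 12.0)) ** 0.5  # square root of the summed variances
print("sample mean:", statistics.fmean(sums))
print("sample std :", statistics.pstdev(sums))
print("fraction within 1 sigma:", fraction_within(sums, mu, sigma))
```

Although each term is flat rather than bell-shaped, the fraction inside $\mu \pm \sigma$ should come out close to the Gaussian value of about 0.68.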

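Gaussian Distribution

The familiar Gaussian pdf is

$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\, \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

$$E[x] = \mu, \qquad V[x] = \sigma^2$$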

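Gaussian Distribution

Some useful properties of the Gaussian distribution are

$P(x \text{ in range } \mu \pm \sigma) = 0.683$
$P(x \text{ in range } \mu \pm 2\sigma) = 0.9545$
$P(x \text{ in range } \mu \pm 3\sigma) = 0.9973$
$P(x \text{ outside range } \mu \pm 3\sigma) = 0.0027$
$P(x \text{ outside range } \mu \pm 5\sigma) = 5.7 \times 10^{-7}$
$P(x \text{ in range } \mu \pm 0.6745\sigma) = 0.5$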
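These coverage probabilities follow from the error function: for a Gaussian, $P(|x - \mu| < k\sigma) = \mathrm{erf}(k/\sqrt{2})$. The sketch below evaluates them with the Python standard library; the function names are illustrative.

```python
from math import erf, erfc, sqrt


def prob_within(k: float) -> float:
    """P(|x - mu| < k sigma) for a Gaussian."""
    return erf(k / sqrt(2.0))


def prob_outside(k: float) -> float:
    """P(|x - mu| > k sigma); erfc avoids the precision loss of 1 - erf for large k."""
    return erfc(k / sqrt(2.0))


for k in (1.0, 2.0, 3.0, 0.6745):
    print(f"P(within {k} sigma)  = {prob_within(k):.4f}")
for k in (3.0, 5.0):
    print(f"P(outside {k} sigma) = {prob_outside(k):.3g}")
```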

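χ² Distribution

The chi-square distribution is

$$f(z; n) = \frac{1}{2^{n/2}\, \Gamma(n/2)}\, z^{n/2 - 1} e^{-z/2} \qquad (z \ge 0),$$

where $n = 1, 2, \ldots$ is the number of degrees of freedom (d.o.f.), and

$$E[z] = n, \qquad V[z] = 2n .$$

The usefulness of this pdf is that for $n$ independent $x_i$, each Gaussian distributed with mean $\mu_i$ and variance $\sigma_i^2$,

$$z = \sum_{i=1}^{n} \frac{(x_i - \mu_i)^2}{\sigma_i^2}$$

follows the χ² distribution with $n$ d.o.f.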
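The mean $n$ and variance $2n$ of the chi-square pdf can be checked by numerical integration. This is a rough sketch using a midpoint rule; the integration range, step count, and the choice $n = 4$ are assumptions made for the demonstration, and the simple rule is not accurate for $n = 1$, where the pdf diverges at $z = 0$.

```python
from math import exp, gamma


def chi2_pdf(z: float, n: int) -> float:
    """Chi-square pdf f(z; n) for z > 0 and n degrees of freedom."""
    return z ** (n / 2.0 - 1.0) * exp(-z / 2.0) / (2.0 ** (n / 2.0) * gamma(n / 2.0))


def chi2_moments(n: int, z_max: float = 200.0, steps: int = 200_000) -> tuple[float, float, float]:
    """Return (normalisation, mean, variance) from a midpoint-rule integration."""
    dz = z_max / steps
    norm = mean = second = 0.0
    for k in range(steps):
        z = (k + 0.5) * dz           # midpoint of the k-th interval
        weight = chi2_pdf(z, n) * dz
        norm += weight
        mean += z * weight
        second += z * z * weight
    return norm, mean, second - mean**2


print(chi2_moments(4))  # compare with (1, n, 2n) for n = 4
```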


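Probability

Probability can be defined in terms of the Kolmogorov axioms. The probability is a real-valued function defined on subsets $A, B, \ldots$ in sample space $S$:

For every subset $A$ in $S$, $P(A) \ge 0$.
If $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$.
$P(S) = 1$.

This means the probability is a measure in which the measure of the entire sample space is 1.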

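Probability

We further define the conditional probability $P(A|B)$, read "$P(A)$ given $B$":

$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$

Bayes' theorem: using $P(A \cap B) = P(B \cap A)$, we have $P(A|B)\,P(B) = P(B|A)\,P(A)$, so

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} .$$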

Probability

For disjoint $A_i$ (whose union is $S$),

$$P(B) = \sum_i P(B|A_i)\,P(A_i),$$

then

$$P(A|B) = \frac{P(B|A)\,P(A)}{\sum_i P(B|A_i)\,P(A_i)} .$$

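Probability

Usually one treats the $A_i$ as outcomes of a repeatable experiment. Then $P(A)$ is usually assigned a value equal to the limiting frequency of occurrence of $A$:

$$P(A) = \lim_{n \to \infty} \frac{\text{number of occurrences of } A \text{ in } n \text{ trials}}{n}$$

This is called frequentist statistics.

But the $A_i$ could also be interpreted as hypotheses, each of which is true or false. Then $P(A)$ represents the degree of belief that hypothesis $A$ is true. This is called Bayesian statistics.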

Bayes' Theorem

Suppose that in the general population
$P(\text{disease}) = 0.001$
$P(\text{no disease}) = 0.999$

Suppose there is a test to check for the disease:
$P(+|\text{disease}) = 0.98$
$P(-|\text{disease}) = 0.02$

But also
$P(+|\text{no disease}) = 0.03$
$P(-|\text{no disease}) = 0.97$

You are tested for the disease and it comes back +. Should you be worried?

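Bayes' Theorem

Apply Bayes' theorem:

$$P(\text{disease}|+) = \frac{P(+|\text{disease})\,P(\text{disease})}{P(+|\text{disease})\,P(\text{disease}) + P(+|\text{no disease})\,P(\text{no disease})} = \frac{0.98 \times 0.001}{0.98 \times 0.001 + 0.03 \times 0.999} = 0.032$$

3.2% of people testing positive have the disease.
Your degree of belief about having the disease is 3.2%.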
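The disease-test calculation can be sketched as a small function for a two-hypothesis Bayes update. The function and parameter names are illustrative; the inputs are the prior of 0.001 and the positive-test rates of 0.98 (with disease) and 0.03 (without disease) from the example.

```python
def posterior(prior: float, p_pos_given_true: float, p_pos_given_false: float) -> float:
    """P(hypothesis | positive test) for a hypothesis that is either true or false."""
    numerator = p_pos_given_true * prior
    evidence = numerator + p_pos_given_false * (1.0 - prior)  # total probability of a + result
    return numerator / evidence


p_disease_given_pos = posterior(prior=0.001, p_pos_given_true=0.98, p_pos_given_false=0.03)
print(f"P(disease | +) = {p_disease_given_pos:.3f}")
```

The posterior stays small because false positives from the large healthy population outnumber true positives from the rare diseased one.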

Bayes' Theorem

Is athlete A guilty of drug doping?

Assume that in a population of athletes in this sport
$P(\text{drug}) = 0.005$
$P(\text{no drug}) = 0.995$

Suppose there is a test to check for the drug:
$P(+|\text{drug}) = 0.99$
$P(-|\text{drug}) = 0.01$

But also
$P(+|\text{no drug}) = 0.004$
$P(-|\text{no drug}) = 0.996$

The athlete is tested positive. Is he/she involved in drug doping?

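Bayes' Theorem

Apply Bayes' theorem:

$$P(\text{drug}|+) = \frac{P(+|\text{drug})\,P(\text{drug})}{P(+|\text{drug})\,P(\text{drug}) + P(+|\text{no drug})\,P(\text{no drug})} = \frac{0.99 \times 0.005}{0.99 \times 0.005 + 0.004 \times 0.995} = 0.55$$

and

$$P(\text{no drug}|+) = \frac{P(+|\text{no drug})\,P(\text{no drug})}{P(+|\text{no drug})\,P(\text{no drug}) + P(+|\text{drug})\,P(\text{drug})} = \frac{0.004 \times 0.995}{0.004 \times 0.995 + 0.99 \times 0.005} = 0.45$$

???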
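The doping example can be written as a Bayes update over a set of disjoint hypotheses, which makes it explicit that the two posteriors share the same denominator and sum to 1. This is a minimal sketch; the dictionary keys and the function name are illustrative, and the numbers are the priors and positive-test rates from the example.

```python
def posteriors(priors: dict[str, float], likelihoods: dict[str, float]) -> dict[str, float]:
    """P(hypothesis | data) for disjoint hypotheses, given P(hypothesis) and P(data | hypothesis)."""
    evidence = sum(priors[h] * likelihoods[h] for h in priors)  # law of total probability
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}


priors = {"drug": 0.005, "no drug": 0.995}
p_positive = {"drug": 0.99, "no drug": 0.004}  # P(+ | hypothesis)

result = posteriors(priors, p_positive)
for hypothesis, p in result.items():
    print(f"P({hypothesis} | +) = {p:.2f}")
```

A positive test leaves the two hypotheses almost equally credible, which is why a single result of this kind is weak evidence on its own.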