ST 501 Course: Fundamentals of Statistical Inference I. Sujit K. Ghosh.

ST 501 Course: Fundamentals of Statistical Inference I

Sujit K. Ghosh
sujit.ghosh@ncsu.edu
Presented at: 2229 SAS Hall, Department of Statistics, NC State University
http://www.stat.ncsu.edu/people/ghosh/courses/st501/

Sujit K. Ghosh ST 501 Course Slides 1

Textbook: Rice, J. A. (2007). Mathematical Statistics and Data Analysis, 3rd Edition
https://www.cengage.com/c/mathematical-statistics-and-data-analysis-3e-rice

Outline
Chap 1: Probability
Chap 2: Random Variables
Chap 3: Joint Distributions
Chap 4: Expected Values
Chap 5: Limit Theorems

Last updated on: August 19, 2018

Why Do We Need Probability Theory?

Consider predicting closing prices of the S&P 500 series for the next 10 trading days.
Suppose we are only interested in predicting whether the price will go up (+) or down (−) from the last trading day.
In other words, we simply want to predict the sign of the log-return of the prices (i.e., if $\text{log.r}_t = \log(P_t / P_{t-1})$, we want to predict $\mathrm{sign}(\text{log.r}_t)$).
Suppose for the last 5 trading days the signs were +, −, +, +, −.
What can you say about the number of + signs for the next 10 trading days?
How much confidence do you have in answering the above question?
What assumptions (if any) have you made to answer the above questions?
Clearly, nothing can be answered with certainty (and we can only provide probabilities).

Let's consider a slightly more abstract experiment with an urn containing $N$ balls of two colors: RED (R) and WHITE (W) [Go Wolfpack!]
Our goal is to guess how many RED (R) balls are in the urn.
Suppose we draw $n$ balls with replacement and $r$ ($\le n$) of them turn out to be RED.
How many RED balls does the urn contain?
Relating to the previous scenario, if we designate a RED ball as a negative sign, we have $N = 15$, $n = 5$ and $r = 2$.
If $R$ denotes the unknown number of RED balls in the urn, a simple-minded strategy is to equate the two proportions, $R/N = r/n$, and guess $\hat{R} = rN/n$.
How accurate is the above answer?
Will your answer change if the $n$ balls were drawn without replacement?
Will your answer change if you knew the sequence of the colors of the balls?

Probability Theory

The concept of probability dates back to the 16th century (gambling experiments).
However, the axiomatic development of probability emerged only in the 20th century.
Probability theory has been applied to a wide variety of events:
Atmospheric sciences: weather and climate predictions
Business and Finance: insurance, prices, yields, etc.
Cosmology: e.g., likelihood of inflation (Gibbons-Hawking-Stewart measure)
Defense: complex reliability systems
Electronics: computer operating systems
Food sciences: agricultural products
...
Zoology: species abundance

The basic notion of probability is based on computing the long-term frequency of events.

Sample Space: the set of all possible outcomes of an experiment, typically denoted by $\Omega$.
(i) Finite sample space: $\Omega = \{RR, RW, WR, WW\}$ if two balls are drawn from an urn containing 4 RED balls and 5 WHITE balls
(ii) Countably infinite sample space: $\Omega = \{0, 1, 2, \ldots\}$ if we count the number of defectives in a large consignment
(iii) Uncountable sample space: $\Omega = (0, \infty)$ if we measure the time until a car battery goes dead

Events: an event is a subset of outcomes of the sample space, typically denoted by $E$.
(i) $E = \{RW, WR, WW\} \subseteq \Omega$: drawing at most one RED ball
(ii) $E = \{3, 4, \ldots\} \subseteq \Omega$: at least 3 defectives
(iii) $E = (0, 20.25] \subseteq \Omega$: battery dies within 20.25 months
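The long-run frequency idea can be illustrated with a small simulation. The sketch below is an illustration rather than part of the slides: it assumes the two balls are drawn without replacement from the urn with 4 RED and 5 WHITE balls, and the names `urn`, `n.sim` and `at.most.one.red` are hypothetical. Under that assumption the exact probability of drawing at most one RED ball is 1 − (4/9)(3/8) = 5/6, which the relative frequency should approach.

```r
# Relative frequency of E = {at most one RED ball} over many repetitions
# (assumption: the two balls are drawn without replacement).
set.seed(501)                              # arbitrary seed, for reproducibility
urn <- c(rep("R", 4), rep("W", 5))         # 4 RED and 5 WHITE balls
n.sim <- 100000                            # number of simulated experiments

at.most.one.red <- replicate(n.sim, {
  draw <- sample(urn, size = 2, replace = FALSE)
  sum(draw == "R") <= 1
})

rel.freq <- mean(at.most.one.red)          # long-run relative frequency of E
exact <- 1 - (4 / 9) * (3 / 8)             # exact value 5/6 under the assumption
print(c(rel.freq = rel.freq, exact = exact))
```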

Set theoretic notions of events (sets):
Empty set: $\emptyset$; a set with no elements
Intersection of events: $E_1 \cap E_2$ contains the elements common to both events $E_1$ and $E_2$. More generally, $\cap_{i=1}^n E_i = E_1 \cap E_2 \cap \cdots \cap E_n$
Union of events: $E_1 \cup E_2$ contains the elements of either $E_1$ and/or $E_2$. More generally, $\cup_{i=1}^n E_i = E_1 \cup E_2 \cup \cdots \cup E_n$
Complement of an event: $E^c$ contains all elements of $\Omega$ not in $E$
Disjoint events: $E_1 \cap E_2 = \emptyset$; the two events have no common elements

Properties of set operations:
Commutative law: $E_1 \cup E_2 = E_2 \cup E_1$ and $E_1 \cap E_2 = E_2 \cap E_1$
Associative law: $(E_1 \cup E_2) \cup E_3 = E_1 \cup (E_2 \cup E_3)$ and $(E_1 \cap E_2) \cap E_3 = E_1 \cap (E_2 \cap E_3)$
Distributive law: $(E_1 \cup E_2) \cap E_3 = (E_1 \cap E_3) \cup (E_2 \cap E_3)$

A probability measure is a real-valued function $P$ defined on the subsets of the sample space (a σ-field of $\Omega$) satisfying the following axioms:
(i) $P(\Omega) = 1$
(ii) $P(E) \ge 0$ for any $E \subseteq \Omega$
(iii) $P(\cup_{i=1}^\infty E_i) = \sum_{i=1}^\infty P(E_i)$ for any sequence of mutually disjoint events $E_1, E_2, \ldots$ (i.e., $E_i \cap E_j = \emptyset$ for any $i \ne j$)

Properties of probability measures:
(i) $P(E^c) = 1 - P(E)$ and $P(\emptyset) = 0$
(ii) $P(E_1) \le P(E_2)$ if $E_1 \subseteq E_2 \subseteq \Omega$
(iii) $P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2)$
(iv) $P(\cup_{i=1}^n E_i) = \sum_{i=1}^n P(E_i \cap F_{i-1}^c)$ where $F_{i-1} = \cup_{j=0}^{i-1} E_j$ for $i = 1, 2, \ldots$ with $E_0 = \emptyset$. It thus follows that $P(\cup_{i=1}^n E_i) \le \sum_{i=1}^n P(E_i)$
(v) $P(\cap_{i=1}^n E_i) \ge \sum_{i=1}^n P(E_i) - (n - 1)$
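The properties above can be checked numerically on a small finite sample space. The following sketch assumes twelve equally likely outcomes and three arbitrarily chosen events; `omega`, `P`, `E1`, `E2` and `E3` are illustrative names, not objects from the course.

```r
# Equally likely outcomes 1..12; events are numeric vectors (illustrative choice)
omega <- 1:12
P <- function(E) length(E) / length(omega)

E1 <- c(1, 2, 3, 4, 5, 6)
E2 <- c(4, 5, 6, 7, 8)
E3 <- c(6, 8, 10, 12)

# (i) complement rule: P(E^c) = 1 - P(E)
comp.ok <- isTRUE(all.equal(P(setdiff(omega, E1)), 1 - P(E1)))

# (iii) inclusion-exclusion for two events
incl.excl.ok <- isTRUE(all.equal(P(union(E1, E2)),
                                 P(E1) + P(E2) - P(intersect(E1, E2))))

# (iv) union bound: P(union) <= sum of the probabilities
p.union <- P(Reduce(union, list(E1, E2, E3)))
union.bound.ok <- p.union <= P(E1) + P(E2) + P(E3)

# (v) Bonferroni bound: P(intersection) >= sum of the probabilities - (n - 1)
p.inter <- P(Reduce(intersect, list(E1, E2, E3)))
bonferroni.ok <- p.inter >= P(E1) + P(E2) + P(E3) - (3 - 1)

print(c(comp.ok, incl.excl.ok, union.bound.ok, bonferroni.ok))
```

A check on one example is of course not a proof; it only guards against misreading the direction of an inequality.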

Probability measure on a finite sample space: $\Omega = \{w_1, w_2, \ldots, w_m\}$
When all outcomes are equally likely,
$$P(E) = \frac{\#E}{\#\Omega} = \frac{\text{number of elements in } E}{\text{number of elements in } \Omega}, \quad \text{where } E \subseteq \Omega$$

An Example: Suppose an NCSU telephone number must appear as 513-????. If all sequences of the remaining four digits are equally likely, what is the probability that a randomly selected NCSU phone number contains seven distinct digits?
Here $\Omega$ = {four possible digits among 0, 1, 2, ..., 9} and hence $\#\Omega = 10 \cdot 10 \cdot 10 \cdot 10 = 10^4$.
Next let $E$ = {four distinct digits among 0, 2, 4, 6, 7, 8, 9}, and hence $\#E = 7 \cdot 6 \cdot 5 \cdot 4$, because the first digit could be any of the seven remaining digits, the second could be any of the remaining six, and so on.
Hence, we get $P(E) = \frac{7 \cdot 6 \cdot 5 \cdot 4}{10^4} = 0.084$.
Thus, we need systematic methods to count without enumerating all possible elements of an event.

Some general rules of counting using combinatorics
If we have $r$ successive selections (decisions) with exactly $n_k$ choices at the $k$-th step, a total of $\prod_{k=1}^r n_k = n_1 n_2 \cdots n_r$ different results are possible.
If we want to draw a sample of size $r$ from a population of $n$ objects, there exist
(i) $n^r$ different ordered samples, when sampled with replacement
(ii) $(n)_r = n(n-1)\cdots(n-r+1)$ ordered samples, when sampled without replacement
(iii) $\binom{n}{r} = \frac{(n)_r}{r!}$ unordered samples (where $r! = r(r-1)\cdots 2 \cdot 1$)

Examples: Suppose we draw a random sample of size $r$ from a population of $n$ distinguishable elements:
(a) $P[\text{no repetition in our sample}] = \frac{(n)_r}{n^r}$ if sampling with replacement
(b) $P[\text{a fixed element is included}] = 1 - \frac{(n-1)^r}{n^r}$ if sampling with replacement
(c) $P[\text{a fixed element is included}] = 1 - \frac{(n-1)_r}{(n)_r} = \frac{r}{n}$ if sampling without replacement
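For a sample space this small, the telephone-number example and the counting rules can be verified by brute force. The sketch below enumerates all $10^4$ completions of 513-???? and compares the result with the counting argument; the helper `falling` (the falling factorial $(n)_r$) and the other names are hypothetical and chosen only for this illustration.

```r
# All 10^4 equally likely completions of 513-????
last4 <- expand.grid(d1 = 0:9, d2 = 0:9, d3 = 0:9, d4 = 0:9)
prefix <- c(5, 1, 3)

# A number has seven distinct digits if prefix plus last four digits has no repeats
seven.distinct <- apply(last4, 1, function(d) length(unique(c(prefix, d))) == 7)

p.enum <- mean(seven.distinct)         # brute-force enumeration
p.count <- (7 * 6 * 5 * 4) / 10^4      # counting argument from the example

# Falling factorial (n)_r = n (n-1) ... (n-r+1); helper name is hypothetical
falling <- function(n, r) if (r == 0) 1 else prod(n:(n - r + 1))

# Example (c): a fixed element is included, sampling without replacement
n <- 10; r <- 4
p.incl.without <- 1 - falling(n - 1, r) / falling(n, r)   # should equal r / n

print(c(p.enum = p.enum, p.count = p.count, p.incl.without = p.incl.without))
```

Enumeration stops being practical very quickly, which is the motivation for the counting rules themselves.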

The number of ways $n$ objects can be grouped into $r$ classes with $n_k$ in the $k$-th class, such that $n = \sum_{k=1}^r n_k$, is
$$\binom{n}{n_1\, n_2 \cdots n_r} = \frac{n!}{n_1!\, n_2! \cdots n_r!}$$
The term $\binom{n}{n_1\, n_2 \cdots n_r}$ is called the multinomial coefficient, because it can be shown that
$$(x_1 + x_2 + \cdots + x_r)^n = \sum \binom{n}{n_1\, n_2 \cdots n_r} x_1^{n_1} x_2^{n_2} \cdots x_r^{n_r}$$
where the sum is over all nonnegative integers $n_1, n_2, \ldots, n_r$ such that $n_1 + n_2 + \cdots + n_r = n$.
How many terms are there in the above summation?
The number of distinct solutions of $n_1 + n_2 + \cdots + n_r = n$ is $\binom{r+n-1}{r-1}$.
The above expression is also the number of ways $n$ indistinguishable objects can be placed into $r$ cells.
If all such placements are equally likely,
$$P[\text{none of the cells remain empty}] = \binom{n-1}{r-1} \Big/ \binom{r+n-1}{r-1}$$
The numerator is the number of distinct solutions such that $n_k \ge 1$ for all $k$.

1. For any $x \in \mathbb{R}$, $\binom{x}{r} = \frac{x(x-1)\cdots(x-r+1)}{r!}$ where $r = 0, 1, 2, \ldots$ [$\binom{x}{r} = 0$ if $r < 0$]
2. For any $a > 0$, $\binom{-a}{r} = (-1)^r \binom{a+r-1}{r}$, for $r = 0, 1, 2, \ldots$
3. For any $x \in \mathbb{R}$, $\binom{x}{r-1} + \binom{x}{r} = \binom{x+1}{r}$, for $r = 0, 1, 2, \ldots$
4. For any $a, b \in \mathbb{R}$, $\sum_{r=0}^n \binom{a}{r}\binom{b}{n-r} = \binom{a+b}{n}$ for $n = 1, 2, \ldots$
5. $(1 + t)^a = \sum_{r=0}^\infty \binom{a}{r} t^r$ for any $a \in \mathbb{R}$ and $t \in (-1, 1)$
6. Using $a = -1$ and integrating above: $\log(1 + t) = \sum_{r=0}^\infty (-1)^r \frac{t^{r+1}}{r+1}$
7. Stirling's formula: $n! \sim \sqrt{2\pi}\, n^{n + \frac{1}{2}} e^{-n}$ [$a_n \sim b_n \iff \lim_{n \to \infty} a_n / b_n = 1$]
8. $\Gamma(x) \sim \sqrt{2\pi}\, x^{x - \frac{1}{2}} e^{-x}$ where $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\, dt$ is the Gamma function and $\Gamma(x+1) = x\Gamma(x)$ for $x > 0$. In particular, $\Gamma(n+1) = n!$

Online Combination and Permutation calculator: www.mathsisfun.com/combinatorics/combinations-permutations-calculator.html
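Several of the counting formulas above are easy to sanity-check for small values. The sketch below uses $n = 6$ and $r = 3$ (arbitrary illustrative values) to enumerate the solutions of $n_1 + n_2 + n_3 = n$, and then looks at how quickly the ratio in Stirling's formula approaches 1. The names `sols`, `multinom` and `stirling.ratio` are hypothetical.

```r
n <- 6; r <- 3   # illustrative values; the enumeration below is written for r = 3

# Enumerate all nonnegative integer solutions of n1 + n2 + n3 = n
grid <- expand.grid(n1 = 0:n, n2 = 0:n, n3 = 0:n)
sols <- grid[rowSums(grid) == n, ]
n.terms <- nrow(sols)                  # compare with choose(r + n - 1, r - 1)

# Solutions with every cell nonempty (n_k >= 1); compare with choose(n - 1, r - 1)
n.nonempty <- sum(apply(sols, 1, function(s) all(s >= 1)))

# Multinomial coefficients sum to r^n (set x1 = x2 = x3 = 1 in the theorem)
multinom <- function(s) factorial(sum(s)) / prod(factorial(s))
coef.sum <- sum(apply(sols, 1, multinom))

# Stirling: the ratio n! / (sqrt(2 pi) n^(n + 1/2) e^(-n)) approaches 1
stirling.ratio <- function(m) factorial(m) / (sqrt(2 * pi) * m^(m + 0.5) * exp(-m))
ratios <- sapply(c(5, 20, 100), stirling.ratio)

print(c(n.terms = n.terms, n.nonempty = n.nonempty, coef.sum = coef.sum))
print(ratios)
```

For larger `m`, `factorial(m)` overflows in double precision, so a comparison on the log scale with `lgamma(m + 1)` would be the safer route.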

Example (sign of log-returns): Consider again drawing $n$ balls from an urn containing $N$ balls with two colors: RED (R) and WHITE (W).
Suppose $r$ out of the $n$ balls are RED. How many RED balls are there in the urn?
Let $R$ = number of (unknown) RED balls, so that there are $W = N - R$ WHITE balls.

(i) Sampling with replacement:
$$P[r \text{ RED balls}] = \binom{n}{r} \frac{R^r (N-R)^{n-r}}{N^n} = \binom{n}{r} \left(\frac{R}{N}\right)^r \left(1 - \frac{R}{N}\right)^{n-r}$$
Find $R$ that maximizes the above probability.

(ii) Sampling without replacement:
$$P[r \text{ RED balls}] = \binom{n}{r} \frac{(R)_r (N-R)_{n-r}}{(N)_n} = \frac{\binom{R}{r}\binom{N-R}{n-r}}{\binom{N}{n}}$$
Find $R$ that maximizes the above probability.

Show that in either case the most likely value of $R$ is given (up to rounding to an integer) by the earlier simple-minded value $\hat{R} = rN/n$ obtained by equating $R/N = r/n$.
Notice that the above answer remains the same even if you knew the order of the sequence of RED and WHITE balls.

Figure 1: Comparing two sampling schemes when $N = 15$, $n = 5$ and $r = 2$. Left panel: sampling with replacement (prob.w against R); right panel: sampling without replacement (prob.wo against R).
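One possible route through the "show that" exercise is sketched here; it is a standard argument, not necessarily the one intended in the course. For sampling with replacement, treat $R$ as a continuous variable in $(0, N)$ and maximize the log-probability:

$$\log P[r \text{ RED balls}] = \text{const} + r \log R + (n - r) \log(N - R)$$

$$\frac{d}{dR} \log P[r \text{ RED balls}] = \frac{r}{R} - \frac{n - r}{N - R} = 0 \iff r(N - R) = (n - r)R \iff R = \frac{rN}{n}$$

For sampling without replacement, write $L(R) = \binom{R}{r}\binom{N-R}{n-r} / \binom{N}{n}$ and compare successive integer values of $R$:

$$\frac{L(R)}{L(R-1)} = \frac{R}{R - r} \cdot \frac{N - R + 1 - (n - r)}{N - R + 1} \ge 1 \iff Rn \le r(N + 1) \iff R \le \frac{r(N+1)}{n}$$

So $L(R)$ increases up to $R = \lfloor r(N+1)/n \rfloor$ and decreases afterwards. This maximizer differs from $rN/n$ by less than one; for $N = 15$, $n = 5$, $r = 2$ it gives $\lfloor 6.4 \rfloor = 6 = rN/n$.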

R codes:

    N=15; n=5; r=2
    # probability of observing r RED balls among n draws, as a function of R
    prob1=function(R){dbinom(r,size=n,prob=R/N)}   # sampling with replacement
    prob2=function(R){dhyper(r,m=R,n=N-R,k=n)}     # sampling without replacement
    par(mfrow=c(1,2),cex=0.75)
    R=0:N; prob.w=prob1(R); prob.wo=prob2(R)
    plot(R, prob.w, type="b", main="sampling with replacement")
    R.hat1=R[which.max(prob.w)]                    # most likely R, with replacement
    abline(v=R.hat1,col="red")
    plot(R, prob.wo, type="b", main="sampling without replacement")
    R.hat2=R[which.max(prob.wo)]                   # most likely R, without replacement
    abline(v=R.hat2,col="red")

What happens when $N = 100$, $n = 50$ and $r = 20$? You may use the above lines of code, simply replacing the first line with

    N=100; n=50; r=20

Figure 2: Comparing two sampling schemes when $N = 100$, $n = 50$ and $r = 20$. Left panel: sampling with replacement (prob.w against R); right panel: sampling without replacement (prob.wo against R).
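To compare the two settings without plotting, the same calculation can be wrapped in a function. This is a sketch along the lines of the code above; the function name `most.likely.R` and the returned labels are hypothetical.

```r
# Most likely number of RED balls under each sampling scheme (no plotting)
most.likely.R <- function(N, n, r) {
  R <- 0:N
  prob.w  <- dbinom(r, size = n, prob = R / N)      # with replacement
  prob.wo <- dhyper(r, m = R, n = N - R, k = n)     # without replacement
  c(with = R[which.max(prob.w)],
    without = R[which.max(prob.wo)],
    simple = r * N / n)                             # simple-minded guess r N / n
}

small <- most.likely.R(N = 15, n = 5, r = 2)
large <- most.likely.R(N = 100, n = 50, r = 20)
print(rbind(small, large))
```

In both settings the three entries agree, which matches the vertical red lines in the two figures.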

Conditional Probability

If $D$ and $E$ are two events (subsets of the sample space $\Omega$), then the probability of $D$ given $E$ is
$$P[D \mid E] = \frac{P(D \cap E)}{P(E)}, \quad \text{provided } P(E) > 0$$
Multiplication rule: $P(D \cap E) = P(E)P(D \mid E) = P(D)P(E \mid D)$
More generally, $P(\cap_{i=1}^n E_i) = P(E_1) \prod_{i=2}^n P(E_i \mid \cap_{j=1}^{i-1} E_j)$
If we assume the Markov property $P(E_i \mid \cap_{j=1}^{i-1} E_j) = P(E_i \mid E_{i-1})$ for $i = 2, 3, \ldots$, then $P(\cap_{i=1}^n E_i) = P(E_1) \prod_{i=2}^n P(E_i \mid E_{i-1})$
Given any sequence of mutually disjoint events $E_i$'s partitioning the sample space, $\Omega = \cup_{i=1}^n E_i$ (where $E_i \cap E_j = \emptyset$ for $i \ne j$), we have the law of Total Probability
$$P(E) = \sum_{i=1}^n P(E \cap E_i) = \sum_{i=1}^n P(E \mid E_i)P(E_i)$$
The law leads to the Bayes rule:
$$P(E_j \mid E) = \frac{P(E \mid E_j)P(E_j)}{\sum_{i=1}^n P(E \mid E_i)P(E_i)}$$
Often the above conditional probability leads to surprising results when the events $E_j$'s are very rare.

An Example: Suppose that the probability that a randomly chosen woman has breast cancer is 0.1%, and suppose a radiology test is 99.9% accurate in detecting breast cancer but gives a false positive result 1% of the time. What is the probability that a woman has breast cancer given a positive test?
Let $E_1$ = {woman has breast cancer} and $E$ = {radiology test is positive}. Notice that $E_2$ = {woman doesn't have breast cancer} = $E_1^c$.
We are given $P(E_1) = 0.001 = 1 - P(E_2)$. We are also told that $P(E \mid E_1) = 0.999$ and $P(E \mid E_2) = 0.01$.
Thus, by Bayes rule it follows:
$$P(E_1 \mid E) = \frac{0.001 \times 0.999}{0.001 \times 0.999 + (1 - 0.001) \times 0.01} = 0.091$$
Most folks (even those in medical professions) are surprised by this result: there is more than a 90% chance that the woman does not have breast cancer, despite a positive result from a highly accurate test.
What is the probability of breast cancer given three successive positive tests? It is almost certain ($\approx 99.9\%$)!!
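The Bayes-rule arithmetic in this example, including the three-test question, can be reproduced with a few lines of R. The sketch below relies on an assumption the slides do not state explicitly: repeated tests are taken to be conditionally independent given the woman's true disease status. The function name `posterior` and its arguments are hypothetical.

```r
# Posterior probability of disease after k positive tests.
# Assumption (not stated in the slides): repeated tests are conditionally
# independent given the true disease status.
posterior <- function(prior, sens, fpr, k = 1) {
  num <- prior * sens^k                     # P(k positives and disease)
  num / (num + (1 - prior) * fpr^k)         # Bayes rule with a two-set partition
}

p.one   <- posterior(prior = 0.001, sens = 0.999, fpr = 0.01)         # one positive test
p.three <- posterior(prior = 0.001, sens = 0.999, fpr = 0.01, k = 3)  # three positive tests
print(round(c(one = p.one, three = p.three), 4))
```

The jump from about 9% to about 99.9% comes from the false-positive rate entering as $0.01^3$ once three positives are required, while the sensitivity term $0.999^3$ barely changes.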

The concept of Independent events (not the same as disjoint!)

Two events $E_1$ and $E_2$ are said to be probabilistically independent if $P(E_1 \mid E_2) = P(E_1)$ or, equivalently, $P(E_2 \mid E_1) = P(E_2)$.
More succinctly, $E_1$ and $E_2$ are independent if $P(E_1 \cap E_2) = P(E_1)P(E_2)$.
More generally, a collection of $n$ events $E_1, E_2, \ldots, E_n$ is mutually independent if $P(\cap_{j=1}^m E_{i_j}) = \prod_{j=1}^m P(E_{i_j})$ for every subcollection $E_{i_1}, E_{i_2}, \ldots, E_{i_m}$ with $m = 2, \ldots, n$.
Thus, it will require $2^n - n - 1$ verifications with $n$ events!
It is possible that 3 events $E_1, E_2, E_3$ are pairwise independent but not mutually independent.
If two events $E_1$ and $E_2$ are disjoint and $P(E_1)P(E_2) > 0$, then $E_1$ and $E_2$ can't be independent.

Practice Problems from Chapter 1: 1, 8, 15, 17, 20, 34, 38, 47, 54, 61, 63, 70 & 77
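A standard illustration of pairwise but not mutual independence uses two fair coin tosses; the example is a textbook classic and is not taken from the slides. In the sketch below the events `E1`, `E2`, `E3` and the helpers `indep` and `n.checks` are hypothetical names.

```r
# Two fair coin tosses: four equally likely outcomes
omega <- c("HH", "HT", "TH", "TT")
P <- function(E) length(E) / length(omega)

E1 <- c("HH", "HT")   # first toss is a head
E2 <- c("HH", "TH")   # second toss is a head
E3 <- c("HH", "TT")   # both tosses agree

# Product condition for a pair of events
indep <- function(A, B) isTRUE(all.equal(P(intersect(A, B)), P(A) * P(B)))
pairwise <- c(indep(E1, E2), indep(E1, E3), indep(E2, E3))

# Mutual independence also needs the triple product condition
p.triple <- P(Reduce(intersect, list(E1, E2, E3)))   # P(E1 and E2 and E3)
mutual <- all(pairwise) && isTRUE(all.equal(p.triple, P(E1) * P(E2) * P(E3)))

# Number of product conditions to verify for n events: 2^n - n - 1
n.checks <- function(n) sum(choose(n, 2:n))

print(list(pairwise = pairwise, mutual = mutual, checks.for.3 = n.checks(3)))
```

Here every pair satisfies the product condition, but the triple intersection has probability 1/4 rather than 1/8, so the three events are not mutually independent.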