Introduction to Statistical Inference Self-study

Contents

Definition, sample space The fundamental object in probability is a nonempty sample space $\Omega$. An event is a subset $A \subseteq \Omega$.

Definition, σ-algebra A family of subsets of $\Omega$, denoted by $\mathcal{F}$, is a σ-algebra on $\Omega$ if 1. $\emptyset \in \mathcal{F}$. 2. If $A \in \mathcal{F}$, then $A^c \in \mathcal{F}$. 3. If $A_1, A_2, \ldots \in \mathcal{F}$, then $\bigcup_i A_i \in \mathcal{F}$.

Definition, probability measure Let $\Omega$ be nonempty, and let $\mathcal{F}$ be a σ-algebra on $\Omega$. The mapping $P : \mathcal{F} \to [0, 1]$ is a probability measure if 1. $P(A) \in [0, 1]$ for all $A \in \mathcal{F}$, 2. $P(\Omega) = 1$, 3. for all $A_1, A_2, \ldots \in \mathcal{F}$ with $A_i \cap A_j = \emptyset$ for $i \neq j$, it holds that $P(\bigcup_i A_i) = \sum_i P(A_i)$.

Corollaries $P(A) = 1 - P(A^c)$. $P(B \cup C) = P(B) + P(C) - P(B \cap C)$.

Example Consider rolling two dice. The corresponding sample space is $\Omega = \{(1, 1), (1, 2), \ldots, (6, 6)\}$. The event "both dice > 2" is $A = \{\omega = (\omega_1, \omega_2) \in \Omega : \omega_1 > 2, \omega_2 > 2\}$. In this example, $P(\{\omega\}) = 1/36$ for all $\omega \in \Omega$.
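To make the example concrete, one can enumerate the 36 outcomes and sum their probabilities over the event. The following is a minimal Python sketch of that idea (the variable names are mine, not part of the notes):

```python
from itertools import product
from fractions import Fraction

# Sample space of two dice: all 36 ordered pairs, each with probability 1/36.
omega = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, 36)

# Event A = "both dice > 2".
A = [w for w in omega if w[0] > 2 and w[1] > 2]

print(len(omega))           # 36
print(p_outcome * len(A))   # P(A) = 16/36 = 4/9
```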

Definition, conditional probability Let $P(B) > 0$. The probability of an event $A$ given $B$, $P(A \mid B)$, is the probability of $A$ under the assumption that $B$ has already occurred. The conditional probability of $A$ given $B$ is $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$.

Example The probability of getting a 3 with the first die (event $A$), given that the second die gave a 4 (event $B$): $P(A \mid B) = P(A \cap B)/P(B) = (1/36)/(6 \cdot 1/36) = 1/6$.

Definition, independence The events $A_1, \ldots, A_n$ are independent if for all $1 \le i_1 < i_2 < \cdots < i_k \le n$, $P(A_{i_1} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) \cdots P(A_{i_k})$.

Example In the dice example, $P(A \cap B) = 1/36$ and, on the other hand, $P(A)P(B) = (1/6)(1/6) = 1/36$, so $A$ and $B$ are independent. The same holds whatever values of the two dice define $A$ and $B$.
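The same enumeration idea can be used to check the conditional-probability and independence computations above. The sketch below is only an illustration, with the events chosen to match the dice example:

```python
from itertools import product
from fractions import Fraction

omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == 3}   # first die shows 3
B = {w for w in omega if w[1] == 4}   # second die shows 4

print(prob(A & B))              # 1/36
print(prob(A) * prob(B))        # 1/36 as well, so A and B are independent
print(prob(A & B) / prob(B))    # conditional probability P(A | B) = 1/6
```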

Definition, random variable A real-valued random variable $X$ is a mapping from the sample space to the real line, i.e. $X = X(\omega) : \Omega \to \mathbb{R}$. More precisely: let $\Omega$ be nonempty and let $\mathcal{F}$ be a σ-algebra on $\Omega$. Let $X = X(\omega) : \Omega \to \mathbb{R}$ be a function. If $\{\omega : X(\omega) \le r\} \in \mathcal{F}$ for all $r \in \mathbb{R}$ (i.e. $X$ is $\mathcal{F}$-measurable), then $X$ is a random variable.

Example, two dice As an example of a random variable, consider the sum: $X : \{(1, 1), \ldots, (6, 6)\} \to \{2, \ldots, 12\}$, $X(\omega) = \omega_1 + \omega_2$. Note, however, that the identity function $Y(\omega_1, \omega_2) = (\omega_1, \omega_2)$ also defines a random variable. Since $Y : \Omega \to \mathbb{R}^2$, this random variable is vector valued.
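As an illustration of pushing the measure through a random variable, the short sketch below tabulates the probability mass function of the sum of two dice by counting outcomes (again my own helper code, not from the notes):

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# Push the uniform measure on the 36 outcomes through X(omega) = omega_1 + omega_2.
counts = Counter(w1 + w2 for w1, w2 in product(range(1, 7), repeat=2))
pmf = {value: Fraction(count, 36) for value, count in sorted(counts.items())}

print(pmf[2], pmf[7], pmf[12])   # 1/36 1/6 1/36
print(sum(pmf.values()))         # 1
```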

Definition, probability function The probability function of a random variable $X$, denoted $P_X$, is $P_X(A) = P(\{\omega : X(\omega) \in A\})$.

Definition, cumulative distribution function The cumulative distribution function (cdf) of a random variable $X$, denoted $F_X$, is $F_X(x) = P(\{\omega \in \Omega : X(\omega) \le x\})$ (or, in short, $= P_X(X \le x)$).

Random variable Usually, in practice, $\omega$ is not observed directly and analysis is based on the observed random variable $X(\omega)$. Thus statistical analysis is based on the measure $P_X$, not on $P$.

Definition, density function and probability mass function The probability density function (pdf) $f_X(x)$ of a continuous random variable $X$ is the derivative of its cumulative distribution function, $f_X(x) = \frac{d}{dx} F_X(x)$. (Note that the density function does not always exist.) In the case of a discrete random variable $X$, the analogue of a probability density function is a probability mass function (pmf) $p_X(x) = P(X = x)$, which is the probability of the event $X = x$.
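As a sanity check of the pdf-as-derivative relation, one can differentiate a known cdf numerically. The sketch below uses an Exponential(1) distribution, which is my choice of example rather than one from the notes:

```python
import numpy as np

# Exponential(1): cdf F(x) = 1 - exp(-x) and pdf f(x) = exp(-x) for x >= 0.
x = np.linspace(0.1, 5.0, 200)
cdf = 1.0 - np.exp(-x)

# A numerical derivative of the cdf should approximate the pdf.
pdf_numeric = np.gradient(cdf, x)
pdf_exact = np.exp(-x)

print(np.max(np.abs(pdf_numeric - pdf_exact)))  # small (finite-difference error)
```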

Random variables are often defined by giving their cumulative distribution functions and/or density functions.

Examples
discrete $X$: for example the binomial or the Poisson distribution
continuous $X$: for example the uniform, normal or exponential distribution
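If SciPy is available, these standard distributions come with ready-made pmf/pdf and cdf functions. The snippet below is only an illustrative use of scipy.stats, with parameter values chosen arbitrarily:

```python
from scipy import stats

# Discrete example: Binomial(n=10, p=0.3).
print(stats.binom.pmf(3, n=10, p=0.3))   # P(X = 3)
print(stats.binom.cdf(3, n=10, p=0.3))   # F_X(3) = P(X <= 3)

# Continuous example: standard normal N(0, 1).
print(stats.norm.pdf(0.0))               # density at 0
print(stats.norm.cdf(1.96))              # approximately 0.975
```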

Multivariate distributions Very often, instead of dealing with one random variable $X$ only, we are interested in several random variables $X_1, \ldots, X_k$.

Joint cumulative distribution function Let $X_1, \ldots, X_k$ be random variables. Then the joint cumulative distribution function of $X_1, \ldots, X_k$ is given by $F_{X_1, \ldots, X_k}(x_1, \ldots, x_k) = P(X_1 \le x_1, \ldots, X_k \le x_k)$.

Joint probability density function Let $X_1, \ldots, X_k$ be continuous random variables. Then the joint probability density function of $X_1, \ldots, X_k$ (if it exists) is given by $f_{X_1, \ldots, X_k}(x_1, \ldots, x_k) = \frac{\partial^k}{\partial x_1 \cdots \partial x_k} F_{X_1, \ldots, X_k}(x_1, \ldots, x_k)$.

Joint probability mass function Let $X_1, \ldots, X_k$ be discrete random variables. Then the joint probability mass function of $X_1, \ldots, X_k$ is given by $p_{X_1, \ldots, X_k}(x_1, \ldots, x_k) = P(X_1 = x_1, \ldots, X_k = x_k)$.

Let $X_1, \ldots, X_k$ be continuous random variables. Assume that the joint probability density function $f_{X_1, \ldots, X_k}(x_1, \ldots, x_k)$ exists. Then $P((X_1, \ldots, X_k) \in A) = \int_{(x_1, \ldots, x_k) \in A} f_{X_1, \ldots, X_k}(x_1, \ldots, x_k)\, dx_1 \cdots dx_k$. For discrete variables $P((X_1, \ldots, X_k) \in A) = \sum_{(x_1, \ldots, x_k) \in A} p_{X_1, \ldots, X_k}(x_1, \ldots, x_k)$, where the sum runs over the possible value combinations of $(X_1, \ldots, X_k)$ that lie in $A$.
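The integral formula can be approximated numerically, e.g. with a Riemann sum on a grid. The sketch below uses two independent Uniform(0, 1) variables (my choice of joint pdf) and the event $A = \{x + y \le 1\}$, whose probability is 1/2:

```python
import numpy as np

# Joint pdf of two independent Uniform(0, 1) variables (illustrative choice):
# f(x, y) = 1 on the unit square and 0 elsewhere.
n = 1000
grid = (np.arange(n) + 0.5) / n        # midpoints of an n-by-n grid on [0, 1]^2
X, Y = np.meshgrid(grid, grid)
f = np.ones_like(X)

# Event A = {x + y <= 1}: approximate the double integral by a Riemann sum.
inside_A = X + Y <= 1.0
prob_A = f[inside_A].sum() / n**2

print(prob_A)   # approximately 0.5
```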

Marginal distributions Let $Z_1, \ldots, Z_h$ and $Y_1, \ldots, Y_l$ be continuous random variables with joint probability density functions $f_{Z_1, \ldots, Z_h}(z_1, \ldots, z_h)$, $f_{Y_1, \ldots, Y_l}(y_1, \ldots, y_l)$ and $f_{Z_1, \ldots, Z_h, Y_1, \ldots, Y_l}(z_1, \ldots, z_h, y_1, \ldots, y_l)$. Then $f_{Z_1, \ldots, Z_h}(z_1, \ldots, z_h) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{Z_1, \ldots, Z_h, Y_1, \ldots, Y_l}(z_1, \ldots, z_h, y_1, \ldots, y_l)\, dy_1 \cdots dy_l$. For discrete variables $p_{Z_1, \ldots, Z_h}(z_1, \ldots, z_h) = \sum_{y_1, \ldots, y_l} p_{Z_1, \ldots, Z_h, Y_1, \ldots, Y_l}(z_1, \ldots, z_h, y_1, \ldots, y_l)$.

Independence Let $X_1, \ldots, X_n$ be continuous random variables with probability density functions $f_{X_1}(x_1), \ldots, f_{X_n}(x_n)$ and a joint probability density function $f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)$. If $f_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n)$, the random variables $X_1, \ldots, X_n$ are independent. Discrete random variables are independent if $p_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = p_{X_1}(x_1) \cdots p_{X_n}(x_n)$.

Example, independence Let $X$ and $Y$ have the joint pdf $f(x, y) = x + y$ for $0 \le x \le 1$, $0 \le y \le 1$, and $f(x, y) = 0$ otherwise. Are the variables $X$ and $Y$ independent? Now $f(x) = \int_0^1 (x + y)\, dy = x + \frac{1}{2}$, $0 < x < 1$, and $f(y) = \int_0^1 (x + y)\, dx = y + \frac{1}{2}$, $0 < y < 1$. If the random variables are independent, then $f(x, y) = f(x) f(y)$. Let $x = 1/3$ and $y = 1/3$. Now $f(x, y) = x + y = 1/3 + 1/3 = 2/3$. On the other hand, $f(x) f(y) = (x + 1/2)(y + 1/2) = \frac{5}{6} \cdot \frac{5}{6} = \frac{25}{36} \neq 2/3$. Thus $X$ and $Y$ are not independent.
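The marginal and independence computations in this example can also be checked numerically. The sketch below integrates $y$ (or $x$) out on a grid and compares $f(x, y)$ with $f(x) f(y)$ at $x = y = 1/3$:

```python
import numpy as np

def joint_pdf(x, y):
    """Joint pdf f(x, y) = x + y on the unit square, as in the example above."""
    return x + y

# Integrate y out (and x out) numerically on a midpoint grid over [0, 1].
grid = (np.arange(10000) + 0.5) / 10000
f_x = joint_pdf(1/3, grid).mean()   # marginal f(1/3), should be close to 1/3 + 1/2 = 5/6
f_y = joint_pdf(grid, 1/3).mean()   # marginal f(1/3), should be close to 5/6 as well

print(f_x, f_y)               # both approximately 0.8333
print(joint_pdf(1/3, 1/3))    # 2/3
print(f_x * f_y)              # ~25/36 ~ 0.694, not equal to 2/3 -> not independent
```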

Example, independence Let $X$ and $Y$ have the joint pmf $p(x, y) = \frac{1}{4}$ for $x \in \{1, 2\}$, $y \in \{1, 2\}$, and $p(x, y) = 0$ otherwise. Now $p(x) = \sum_{y \in \{1, 2\}} p(x, y) = 1/4 + 1/4 = \frac{1}{2}$ for $x \in \{1, 2\}$, and otherwise $p(x) = 0$; and $p(y) = \sum_{x \in \{1, 2\}} p(x, y) = 1/4 + 1/4 = \frac{1}{2}$ for $y \in \{1, 2\}$, and otherwise $p(y) = 0$.

If $p(x, y) = p(x) p(y)$ for all $x, y$, then $X$ and $Y$ are independent. Now $p(x) p(y) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4} = p(x, y)$ for $x \in \{1, 2\}$, $y \in \{1, 2\}$, and $p(x) p(y) = 0 = p(x, y)$ otherwise. The random variables are independent!
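With the joint pmf stored as a table, the independence check amounts to comparing the table with the outer product of its marginals, as in this small sketch:

```python
import numpy as np

# Joint pmf of the example as a table: rows index x in {1, 2}, columns y in {1, 2}.
p_xy = np.array([[0.25, 0.25],
                 [0.25, 0.25]])

p_x = p_xy.sum(axis=1)   # marginal of X: [0.5, 0.5]
p_y = p_xy.sum(axis=0)   # marginal of Y: [0.5, 0.5]

# Independent if the joint table equals the outer product of the marginals.
print(np.allclose(p_xy, np.outer(p_x, p_y)))   # True
```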

Conditional distribution Let $Z_1, \ldots, Z_n$ and $Y_1, \ldots, Y_m$ be continuous random variables with joint probability density functions $f_{Z_1, \ldots, Z_n}(z_1, \ldots, z_n)$, $f_{Y_1, \ldots, Y_m}(y_1, \ldots, y_m)$ and $f_{Z_1, \ldots, Z_n, Y_1, \ldots, Y_m}(z_1, \ldots, z_n, y_1, \ldots, y_m)$. Then $f_{Y_1, \ldots, Y_m \mid Z_1, \ldots, Z_n}(y_1, \ldots, y_m \mid z_1, \ldots, z_n) = \frac{f_{Z_1, \ldots, Z_n, Y_1, \ldots, Y_m}(z_1, \ldots, z_n, y_1, \ldots, y_m)}{f_{Z_1, \ldots, Z_n}(z_1, \ldots, z_n)}$ for $f_{Z_1, \ldots, Z_n}(z_1, \ldots, z_n) > 0$. For discrete random variables $p_{Y_1, \ldots, Y_m \mid Z_1, \ldots, Z_n}(y_1, \ldots, y_m \mid z_1, \ldots, z_n) = \frac{p_{Z_1, \ldots, Z_n, Y_1, \ldots, Y_m}(z_1, \ldots, z_n, y_1, \ldots, y_m)}{p_{Z_1, \ldots, Z_n}(z_1, \ldots, z_n)}$ for $p_{Z_1, \ldots, Z_n}(z_1, \ldots, z_n) > 0$.
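For instance, with the joint pdf $f(x, y) = x + y$ from the earlier example, the conditional density of $Y$ given $X = x$ is $(x + y)/(x + 1/2)$. A quick numerical check that it integrates to one over $y$ is sketched below (the function names are mine):

```python
import numpy as np

def joint_pdf(x, y):
    # f(x, y) = x + y on the unit square, from the earlier example.
    return x + y

def marginal_x(x):
    # f(x) = x + 1/2 for 0 < x < 1, computed earlier.
    return x + 0.5

def conditional_y_given_x(y, x):
    # f(y | x) = f(x, y) / f(x).
    return joint_pdf(x, y) / marginal_x(x)

# A conditional density should integrate to 1 over y for each fixed x.
ys = (np.arange(10000) + 0.5) / 10000
print(conditional_y_given_x(ys, x=0.25).mean())   # approximately 1
```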

Definition, expected value Let $X$ be a continuous random variable. If $\int |h(x)| f_X(x)\, dx < \infty$, then the expected value of the random variable $h(X)$ is (the real number) $E[h(X)] = \int h(x) f_X(x)\, dx$. Let $X$ be a discrete random variable with domain $I$. If $\sum_{x \in I} |h(x)|\, p_X(x) < \infty$, then the expected value of $h(X)$ is $E[h(X)] = \sum_{x \in I} h(x) p_X(x)$.

Example The expected value of $X$, $E[X]$, is obtained by setting $h(x) = x$. The variance of $X$, $\mathrm{var}[X]$, is obtained by setting $h(x) = (x - E[X])^2$. The $k$th moment of $X$, $E[X^k]$, is obtained by setting $h(x) = x^k$.

Numerical example, expected value Let $X$ be a continuous random variable with the pdf $f_X(x) = 1$ for $0 \le x \le 1$ and $f_X(x) = 0$ otherwise. Now $E[X] = \int x f_X(x)\, dx = \int_0^1 x \cdot 1\, dx = \frac{1}{2}$. Let $X$ be a discrete random variable with the pmf $p_X(x) = P(X = x) = \frac{1}{30} x^2$, $x \in \{1, 2, 3, 4\}$. Now $E[X] = \sum_x x\, p_X(x) = 1 \cdot \frac{1}{30} + 2 \cdot \frac{4}{30} + 3 \cdot \frac{9}{30} + 4 \cdot \frac{16}{30} = \frac{100}{30} = \frac{10}{3}$.
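Both expected values can be verified numerically. The sketch below approximates the integral with a grid average and evaluates the discrete sum exactly with fractions:

```python
import numpy as np
from fractions import Fraction

# Continuous case: f_X(x) = 1 on [0, 1], so E[X] = integral of x * f_X(x) dx.
xs = (np.arange(100000) + 0.5) / 100000
print((xs * 1.0).mean())                    # approximately 0.5

# Discrete case: p_X(x) = x^2 / 30 for x in {1, 2, 3, 4}.
pmf = {x: Fraction(x * x, 30) for x in range(1, 5)}
print(sum(x * p for x, p in pmf.items()))   # 10/3
```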

Theorems, formulae for expectation and variance Let $X_1, \ldots, X_n$ be random variables with finite expectations and variances. Let $a, b \in \mathbb{R}$. Then
$E[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} E[X_i]$
$E[aX_i + b] = aE[X_i] + b$
$\mathrm{var}[aX_i + b] = a^2 \mathrm{var}[X_i]$
Let $X_1, \ldots, X_n$ be independent. Then
$E[X_1 X_2 \cdots X_n] = E[X_1] E[X_2] \cdots E[X_n]$
$\mathrm{var}[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} \mathrm{var}[X_i]$
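These identities can also be illustrated by simulation. The sketch below draws samples of two independent random variables of my own choosing and compares both sides of each formula (agreement is exact for the linearity identities and up to Monte Carlo error for the independence ones):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 1_000_000

# Two independent variables, chosen only for illustration:
# X1 ~ Uniform(0, 1) and X2 ~ Exponential(1).
x1 = rng.uniform(0.0, 1.0, n_samples)
x2 = rng.exponential(1.0, n_samples)
a, b = 3.0, 2.0

print(np.mean(x1 + x2), np.mean(x1) + np.mean(x2))    # E[X1 + X2] = E[X1] + E[X2]
print(np.mean(a * x1 + b), a * np.mean(x1) + b)       # E[aX1 + b] = aE[X1] + b
print(np.var(a * x1 + b), a**2 * np.var(x1))          # var[aX1 + b] = a^2 var[X1]
print(np.mean(x1 * x2), np.mean(x1) * np.mean(x2))    # independence: E[X1 X2] = E[X1]E[X2]
print(np.var(x1 + x2), np.var(x1) + np.var(x2))       # independence: variances add
```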

J. S. Milton, J. C. Arnold: Introduction to Probability and Statistics, McGraw-Hill, 1995.
J. Crawshaw, J. Chambers: A Concise Course in Advanced Level Statistics, Nelson Thornes, 2013.
R. V. Hogg, J. W. McKean, A. T. Craig: Introduction to Mathematical Statistics, Pearson Education, 2005.
P. Laininen: Todennäköisyys ja sen tilastollinen soveltaminen, Otatieto, 1998, no. 586.
I. Mellin: Tilastolliset menetelmät, http://math.aalto.fi/opetus/sovtoda/materiaali.html