Review of Statistics I

Hüseyin Taştan, Department of Economics, Yildiz Technical University. April 17, 2010.

Review of Distribution Theory
- Random variables: discrete vs. continuous
- Probability distribution functions
- Probability density function
- Expectation operator
- Normal distribution and related distributions: t-distribution, chi-square distribution, F-distribution

Random Variables
- A random variable is a function from the sample space S to the real numbers, $X: S \to \mathbb{R}$; it is an ordinary measurable function.
- The value of a random variable cannot be known before the underlying event is realized (uncertainty).
- Notation: capital letters X, Y, Z (or any other letter) denote random variables; specific values of a random variable are denoted by lower-case letters x, y, z.
- Using this notation we may calculate $P(X = x)$ for a discrete random variable.
- Two types of rv: discrete vs. continuous.

Discrete Random Variable
- Can take a countable number of distinct possible values.
- Whenever all possible values a random variable can assume can be listed (or counted), the random variable is discrete.
- Example: the sum of the two numbers facing up when two dice are rolled: x = 2, 3, ..., 12.
- Example: the number of sales made by a salesperson per week: x = 0, 1, 2, ...
- Example: the number of errors made by an accountant.

Continuous Random Variable
- Can take an uncountable number of values within an interval on the real line.
- Many variables from business and economics belong to this category.
- Example: average disposable income in a city.
- Example: closing value of a stock exchange index.
- Example: the inflation rate for a given month.

Probability Distributions for Discrete R.V.
Probability distribution function $f(x)$:
$f(x) \ge 0$, $\quad f(x) = P(X = x)$, $\quad \sum_x f(x) = 1$
Cumulative distribution function:
$P(X \le x_0) = F(x_0) = \sum_{x \le x_0} f(x)$
The probability distribution of a discrete random variable is a formula, graph, or table that specifies the probability associated with each possible value the random variable can assume.

Example
- Three coins are tossed. Let X denote the number of heads.
- The sample space contains 8 points: (HHH), (HHT), (HTH), (THH), (HTT), (THT), (TTH), and (TTT).
- These 8 events are mutually exclusive and each has the same probability: 1/8.
- X can take four values: 0, 1, 2, 3.

Example continued
Distribution of X:

Outcomes         x    f(x)
TTT              0    1/8
TTH, THT, HTT    1    3/8
THH, HTH, HHT    2    3/8
HHH              3    1/8

Example continued
Distribution function of X:

x       0     1     2     3
f(x)    1/8   3/8   3/8   1/8

Note that $\sum_x f(x) = 1$.
Find $P(X \le 1)$ and $P(1 \le X \le 3)$.

Example continued
Cumulative distribution function of X:
$$F(x) = P(X \le x) = \begin{cases} 0, & x < 0 \\ 1/8, & 0 \le x < 1 \\ 1/2, & 1 \le x < 2 \\ 7/8, & 2 \le x < 3 \\ 1, & x \ge 3 \end{cases}$$

Example continued
[Figure: probability function f(x) (top) and cumulative probability function F(x) (bottom) of X, plotted over −1 ≤ x ≤ 4.]
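
The pmf and cdf above can be checked by brute-force enumeration of the sample space; a minimal Python sketch, using only the standard library:

```python
from itertools import product
from fractions import Fraction

# Enumerate the 8 equally likely outcomes of tossing three coins
outcomes = list(product("HT", repeat=3))  # ('H','H','H'), ('H','H','T'), ...

# pmf of X = number of heads
pmf = {x: Fraction(sum(o.count("H") == x for o in outcomes), len(outcomes))
       for x in range(4)}
print(pmf)  # X=0: 1/8, X=1: 3/8, X=2: 3/8, X=3: 1/8

# cdf F(x) = P(X <= x)
cdf = {x: sum(pmf[k] for k in range(x + 1)) for x in range(4)}
print(cdf)  # X=0: 1/8, X=1: 1/2, X=2: 7/8, X=3: 1
```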

Expectations of Discrete R.V.
Expected value of a discrete rv X:
$$E(X) = \sum_x x f(x)$$
If $g(X)$ is a function of X, then the expected value of $g(X)$ is
$$E(g(X)) = \sum_x g(x) f(x)$$
If $g(X) = X^2$ then $E(g(X)) = E(X^2) = \sum_x x^2 f(x)$.

Example
Let's find the expected value of X in the previous example:
$$E(X) = 0 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{3}{8} + 2 \cdot \tfrac{3}{8} + 3 \cdot \tfrac{1}{8} = \tfrac{3}{2}$$
Find the expected value of $g(X) = X^2$:
$$E(X^2) = 0 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{3}{8} + 4 \cdot \tfrac{3}{8} + 9 \cdot \tfrac{1}{8} = 3$$
Find the expected value of $g(X) = 2X + 3X^2$.
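
These sums, and the last question, can be evaluated directly from the pmf; by linearity the answer is $E(2X + 3X^2) = 2 \cdot \tfrac{3}{2} + 3 \cdot 3 = 12$. A standard-library Python sketch:

```python
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

E  = sum(x * p for x, p in pmf.items())               # E(X)   = 3/2
E2 = sum(x**2 * p for x, p in pmf.items())            # E(X^2) = 3
Eg = sum((2*x + 3*x**2) * p for x, p in pmf.items())  # E(2X + 3X^2)
print(E, E2, Eg)  # 3/2 3 12, matching 2*E(X) + 3*E(X^2) by linearity
```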

Variance of a Discrete RV
Definition:
$$\mathrm{Var}(X) = E\left[(X - E(X))^2\right] = E\left[X^2 - 2XE(X) + (E(X))^2\right] = E(X^2) - 2(E(X))^2 + (E(X))^2 = E(X^2) - (E(X))^2$$
Let $\mu_x = E(X)$; then the variance can be written as
$$\mathrm{Var}(X) = E(X^2) - \mu_x^2$$

Moments of a Discrete RV
Definition: the kth moment of a discrete rv is defined as
$$\mu_k = E(X^k) = \sum_x x^k f(x), \quad k = 0, 1, 2, \ldots$$
- 1st moment: $\mu_1 = E(X)$ = population mean
- 2nd moment: $\mu_2 = E(X^2) = \mathrm{Var}(X) + \mu_1^2$
- 3rd moment: $\mu_3 = E(X^3)$
- 4th moment: $\mu_4 = E(X^4)$

Central Moments of a Discrete RV
Definition: the kth central moment of a discrete rv is defined as
$$m_k = E((X - \mu_1)^k) = \sum_x (x - \mu_1)^k f(x), \quad k = 0, 1, 2, \ldots$$
- 1st central moment: $m_1 = 0$
- 2nd central moment: $m_2 = E((X - \mu_1)^2) = \mathrm{Var}(X)$
- 3rd central moment: $m_3 = E((X - \mu_1)^3)$
- 4th central moment: $m_4 = E((X - \mu_1)^4)$

Standard Moments of a Discrete RV
Definition: the kth standard moment of a discrete rv is defined as
$$\gamma_k = \frac{m_k}{\sigma^k}, \quad k = 0, 1, 2, \ldots$$
where $\sigma$ is the population standard deviation: $\sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{E\left[(X - \mu_1)^2\right]}$
- 1st standard moment: $\gamma_1 = 0$
- 2nd standard moment: $\gamma_2 = 1$ (why?)
- 3rd standard moment: $\gamma_3 = m_3/\sigma^3$ (skewness)
- 4th standard moment: $\gamma_4 = m_4/\sigma^4$ (kurtosis)
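
For a concrete check, the raw, central, and standard moments of the three-coin pmf from the earlier example can be computed directly from these definitions (a standard-library sketch):

```python
from fractions import Fraction
import math

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def raw_moment(k):
    return sum(x**k * p for x, p in pmf.items())

mu1 = raw_moment(1)  # population mean

def central_moment(k):
    return sum((x - mu1)**k * p for x, p in pmf.items())

var = central_moment(2)
sigma = math.sqrt(var)
skewness = central_moment(3) / sigma**3  # gamma_3
kurtosis = central_moment(4) / sigma**4  # gamma_4
print(mu1, var, skewness, kurtosis)  # 3/2, 3/4, 0.0 (symmetric pmf), ~2.33
```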

Some Discrete Distributions
- Bernoulli
- Binomial
- Hypergeometric
- Poisson

Bernoulli(p) Distribution
$$f(x) = \begin{cases} p, & \text{if } x = 1 \\ 1 - p, & \text{if } x = 0 \end{cases}$$
Expected value: $E(X) = \sum_x x f(x) = p \cdot 1 + (1 - p) \cdot 0 = p$
Second moment: $E(X^2) = \sum_x x^2 f(x) = p \cdot 1 + (1 - p) \cdot 0 = p$
Variance (second central moment): $\mathrm{Var}(X) = E(X^2) - (E(X))^2 = p - p^2 = p(1 - p)$

Binomial Distribution
Let X be the number of successes in n independent trials of a Bernoulli random experiment. If each $Y_i$ is distributed as Bernoulli(p), then $X = \sum_{i=1}^n Y_i$ has a Binomial(n, p) distribution; X denotes the total number of successes.
$$f(x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n!}{x!(n-x)!}\, p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$$
$E(X) = np$, $\quad \mathrm{Var}(X) = np(1-p)$

Hypergeometric Distribution
If the Bernoulli trials are not independent, then the total number of successes does not have a Binomial distribution but follows the hypergeometric distribution. In an N-element population containing B successes, the total number of successes X in a random sample of size n has the distribution
$$f(x) = \frac{\binom{B}{x}\binom{N-B}{n-x}}{\binom{N}{n}}$$
Here x can take integer values between $\max(0,\, n - (N - B))$ and $\min(n, B)$.
$$E(X) = np, \quad \mathrm{Var}(X) = \frac{N-n}{N-1}\, np(1-p), \quad p = \frac{B}{N}$$

Poisson Distribution
Describes the distribution of the number of realizations of a certain event in a given period of time.
Notation: $X \sim \mathrm{Poisson}(\lambda)$
Probability distribution function:
$$f(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$$
$E(X) = \lambda$, $\quad \mathrm{Var}(X) = \lambda$, $\quad$ skewness $= \frac{1}{\sqrt{\lambda}}$, $\quad$ excess kurtosis $= \frac{1}{\lambda}$
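
These means and variances can be cross-checked numerically; a sketch using scipy.stats (assumed available; the parameter values are illustrative):

```python
from scipy import stats

n, p, lam = 10, 0.3, 4.0  # illustrative parameters

print(stats.bernoulli.stats(p, moments="mv"))   # (p, p(1-p)) = (0.3, 0.21)
print(stats.binom.stats(n, p, moments="mv"))    # (np, np(1-p)) = (3.0, 2.1)
print(stats.poisson.stats(lam, moments="mvs"))  # mean 4, var 4, skew 1/sqrt(4) = 0.5
```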

Joint Distributions of Discrete RV
Let X and Y be two discrete rvs. Then the joint pd is
$$f(x, y) = P(X = x \cap Y = y)$$
More generally, the joint pd of k discrete rvs $X_1, X_2, \ldots, X_k$ is
$$f(x_1, x_2, \ldots, x_k) = P(X_1 = x_1 \cap X_2 = x_2 \cap \cdots \cap X_k = x_k)$$

Example
X: number of customers waiting in line for Counter 1 in a bank; Y: number of customers waiting in line for Counter 2. The joint pd of these two rvs is given by:

Joint probability distribution of X and Y
y \ x    0      1      2      3      Total
0        0.05   0.21   0      0      0.26
1        0.20   0.26   0.08   0      0.54
2        0      0.06   0.07   0.02   0.15
3        0      0      0.03   0.02   0.05
Total    0.25   0.53   0.18   0.04   1.00

Marginal Distribution Function
Marginal DF of X: $f(x) = \sum_y f(x, y)$
Marginal DF of Y: $f(y) = \sum_x f(x, y)$

Conditional Distribution Function
Conditional DF of X given Y = y: $f(x \mid y) = \dfrac{f(x, y)}{f(y)}$
Conditional DF of Y given X = x: $f(y \mid x) = \dfrac{f(x, y)}{f(x)}$

Statistical Independence
Discrete rvs X and Y are said to be independent if and only if
$$f(x, y) = f(x) f(y)$$
or, in other words,
$$f(x \mid y) = f(x) \quad \text{and} \quad f(y \mid x) = f(y)$$

Covariance between two discrete rv
Let g(X, Y) be a function of X and Y. The expected value of this function is
$$E[g(X, Y)] = \sum_x \sum_y g(x, y) f(x, y)$$
Now let $g(X, Y) = (X - \mu_x)(Y - \mu_y)$. The expected value of this function is called the covariance:
$$\mathrm{Cov}(X, Y) = \sum_x \sum_y (x - \mu_x)(y - \mu_y) f(x, y)$$
It measures the linear association between two random variables. Statistical independence implies that the covariance is zero, but the reverse is not true.
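
Applied to the bank-counter table above, the marginals, means, and covariance follow mechanically; a numpy sketch (numpy assumed available):

```python
import numpy as np

# Joint pmf from the bank-counter example: rows are y = 0..3, columns x = 0..3
f = np.array([[0.05, 0.21, 0.00, 0.00],
              [0.20, 0.26, 0.08, 0.00],
              [0.00, 0.06, 0.07, 0.02],
              [0.00, 0.00, 0.03, 0.02]])
x = y = np.arange(4)

fx = f.sum(axis=0)               # marginal of X: [0.25, 0.53, 0.18, 0.04]
fy = f.sum(axis=1)               # marginal of Y: [0.26, 0.54, 0.15, 0.05]
mx, my = x @ fx, y @ fy          # E(X) = 1.01, E(Y) = 0.99
exy = (np.outer(y, x) * f).sum() # E(XY): outer(y, x)[i, j] = y_i * x_j
print(exy - mx * my)             # Cov(X,Y) ~ 0.30 > 0: X and Y are not independent
```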

Probability Distributions for Continuous Random Variables
- A continuous rv can assume any value within some interval.
- The probability distribution of a continuous rv is a smooth function f(x) called the probability density function (pdf).
- The probability associated with any single value of X is 0.
- Properties of a pdf f(x):
$$f(x) \ge 0, \quad \int_{-\infty}^{\infty} f(x)\,dx = 1, \quad P(a < X < b) = \int_a^b f(x)\,dx$$

A Probability Density Function
[Figure: a pdf f(x); the shaded area between a and b equals $P(a < X < b) = \int_a^b f(x)\,dx$.]

Cumulative Distribution Function
The cumulative distribution function (cdf), denoted F(x), gives the probability that a random variable does not exceed a given value x.
Definition:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
From this definition the cdf and pdf are related by
$$f(x) = \frac{dF(x)}{dx}$$

Properties of CDF
- $F(-\infty) = 0$, $\quad F(+\infty) = 1$
- F(x) is a nondecreasing function of x: $x_1 \le x_2 \Rightarrow F(x_1) \le F(x_2)$.
- $P(a < X < b) = F(b) - F(a) = \int_a^b f(x)\,dx$

Derivation of the last property:
$$P(-\infty < X < +\infty) = P(-\infty < X < a) + P(a < X < b) + P(b < X < +\infty)$$
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{a} f(x)\,dx + \int_a^b f(x)\,dx + \int_b^{+\infty} f(x)\,dx$$
$$F(+\infty) - F(-\infty) = \left[F(a) - F(-\infty)\right] + P(a < X < b) + \left[F(+\infty) - F(b)\right]$$
$$1 = F(a) - 0 + P(a < X < b) + 1 - F(b)$$
$$P(a < X < b) = F(b) - F(a)$$

CDF
[Figure: pdf f(x) with a and b marked; annotations show $\int_a^b f(x)\,dx = F(b) - F(a)$, $F(+\infty) - F(b) = 1 - F(b)$, and $F(a) - F(-\infty) = F(a)$.]

Expectations of Continuous Random Variables
$$E(X) \equiv \mu_x = \int_{-\infty}^{\infty} x f(x)\,dx$$
If g(X) is a continuous function of X, then
$$E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$$
$$\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu_x)^2 f(x)\,dx$$

Variance of Continuous RV
$$\mathrm{Var}(X) = E\left[(X - E(X))^2\right] = \int_{-\infty}^{\infty} (x - \mu_x)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - 2\mu_x \int_{-\infty}^{\infty} x f(x)\,dx + \mu_x^2 \int_{-\infty}^{\infty} f(x)\,dx = E(X^2) - \mu_x^2$$
Note that we used $\int_{-\infty}^{\infty} f(x)\,dx = 1$ and $\int_{-\infty}^{\infty} x f(x)\,dx = E(X) \equiv \mu_x$.

Moments of Continuous RV
Definition: the kth moment of a continuous rv X is given by
$$\mu_k = E(X^k) = \int_{-\infty}^{\infty} x^k f(x)\,dx, \quad k = 0, 1, 2, \ldots$$
- 1st moment: $\mu_1 = E(X)$ = population mean
- 2nd moment: $\mu_2 = E(X^2) = \mathrm{Var}(X) + \mu_1^2$
- 3rd moment: $\mu_3 = E(X^3)$
- 4th moment: $\mu_4 = E(X^4)$

Central Moments of Continuous RV
Definition: the kth central moment of X is
$$m_k = E((X - \mu_1)^k) = \int_{-\infty}^{\infty} (x - \mu_1)^k f(x)\,dx, \quad k = 0, 1, 2, \ldots$$
- 1st central moment: $m_1 = 0$
- 2nd central moment: $m_2 = E((X - \mu_1)^2) = \mathrm{Var}(X)$
- 3rd central moment: $m_3 = E((X - \mu_1)^3)$
- 4th central moment: $m_4 = E((X - \mu_1)^4)$

Properties of Expectation Operator
Linearity: let $Y = a + bX$; then the expected value of Y is
$$E[Y] = E[a + bX] = a + bE(X)$$
More generally, let $X_1, X_2, \ldots, X_n$ be n continuous rvs and let Y be a linear combination of these rvs:
$$Y = b_1 X_1 + b_2 X_2 + \ldots + b_n X_n$$
The expected value of Y is given by
$$E[Y] = b_1 E[X_1] + b_2 E[X_2] + \ldots + b_n E[X_n]$$
or, more compactly,
$$E(Y) = E\left(\sum_{i=1}^n b_i X_i\right) = \sum_{i=1}^n b_i E(X_i)$$

Properties of Expectation Operator
For a nonlinear function h(X) we have, in general,
$$E[h(X)] \ne h(E(X))$$
For example, $E(X^2) \ne (E(X))^2$ and $E(\ln X) \ne \ln(E(X))$.
For two continuous rvs X and Y,
$$E\left(\frac{X}{Y}\right) \ne \frac{E(X)}{E(Y)}$$

Properties of Variance
- Let c be any constant; then $\mathrm{Var}(c) = 0$.
- Variance of $Y = bX$ where b is a constant: $\mathrm{Var}(Y) = \mathrm{Var}(bX) = b^2\,\mathrm{Var}(X)$
- Variance of $Y = a + bX$: $\mathrm{Var}(Y) = \mathrm{Var}(a + bX) = b^2\,\mathrm{Var}(X)$
- For two independent continuous rvs X and Y:
$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y), \quad \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$
This can easily be generalized to n independent rvs.

Continuous Standard Uniform Distribution
Notation: $X \sim U(0, 1)$; pdf:
$$f(x) = \begin{cases} 1, & \text{if } 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}$$
$E(X) = \tfrac{1}{2}$, $\quad \mathrm{Var}(X) = \tfrac{1}{12}$

General Uniform Distribution
Notation: $X \sim U(a, b)$; pdf:
$$f(x; a, b) = \begin{cases} \frac{1}{b-a}, & \text{if } a < x < b \\ 0, & \text{otherwise} \end{cases}$$
$E(X) = \frac{a+b}{2}$, $\quad$ median $= \frac{a+b}{2}$, $\quad \mathrm{Var}(X) = \frac{(b-a)^2}{12}$, $\quad$ skewness $= 0$, $\quad$ excess kurtosis $= -\frac{6}{5}$

General Uniform Distribution
Expectation of $X \sim U(a, b)$:
$$E(X) = \int_a^b \frac{x}{b-a}\,dx = \frac{1}{b-a}\left[\frac{b^2 - a^2}{2}\right] = \frac{(b-a)(b+a)}{2(b-a)} = \frac{a+b}{2}$$

General Uniform Distribution
For $X \sim U(a, b)$, the expected value of $g(X) = X^2$ is
$$E[X^2] = \int_a^b x^2\,\frac{1}{b-a}\,dx = \frac{b^3 - a^3}{3(b-a)} = \frac{(b-a)(b^2 + ab + a^2)}{3(b-a)} = \frac{a^2 + ab + b^2}{3}$$

General Uniform Distribution
Variance of $X \sim U(a, b)$:
$$\mathrm{Var}(X) = E[(X - E(X))^2] = E(X^2) - [E(X)]^2 = \frac{a^2 + ab + b^2}{3} - \frac{(a+b)^2}{4} = \frac{(b-a)^2}{12}$$

General Uniform Distribution
Cumulative distribution function (cdf) of U(a, b):
$$F(x) = P(X \le x) = \int_a^x \frac{1}{b-a}\,dt = \left.\frac{t}{b-a}\right|_a^x = \frac{x-a}{b-a}, \quad \text{for } a \le x \le b$$
$$F(x) = \begin{cases} 0, & \text{for } x < a \\ \frac{x-a}{b-a}, & \text{for } a \le x \le b \\ 1, & \text{for } x > b \end{cases}$$

Uniform Distribution
[Figure: pdf f(x), constant at 1/(b−a) on (a, b), and cdf F(x), rising linearly from 0 at a to 1 at b.]
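
The mean and variance formulas for U(a, b) can be verified by numerical integration; a scipy sketch with illustrative endpoints:

```python
from scipy import integrate

# Check E(X) = (a+b)/2 and Var(X) = (b-a)^2/12 for U(a,b) by quadrature
a, b = 2.0, 5.0                      # illustrative endpoints
pdf = lambda x: 1.0 / (b - a)

mean, _ = integrate.quad(lambda x: x * pdf(x), a, b)
ex2,  _ = integrate.quad(lambda x: x**2 * pdf(x), a, b)
print(mean, (a + b) / 2)               # 3.5  3.5
print(ex2 - mean**2, (b - a)**2 / 12)  # 0.75 0.75
```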

Example
$$f(x) = \begin{cases} e^{-x}, & \text{if } 0 < x < \infty \\ 0, & \text{otherwise} \end{cases}$$
1. Show that this function is a pdf.
2. Draw the graph of this function and mark the area for X > 1.
3. Calculate the probability P(X > 1).
4. Find the cdf.

Example continued
1. Let us see if the conditions for a pdf are satisfied:
(i) First of all, the condition $f(x) \ge 0$ is satisfied for all values within the interval $0 < x < \infty$.
(ii) It should integrate to 1:
$$\int_0^{\infty} e^{-x}\,dx = \left[-e^{-x}\right]_0^{\infty} = -e^{-\infty} - (-e^0) = 0 + 1 = 1$$
where $e^{-\infty} = \lim_{x \to \infty} e^{-x} = 0$.

Example continued
1. P(X > 1) is the area shown in the figure.
2. $P(X > 1) = \int_1^{\infty} e^{-x}\,dx = \left[-e^{-x}\right]_1^{\infty} = e^{-1} \approx 0.3679$
3. $F(x) = \int_0^x e^{-t}\,dt = \left[-e^{-t}\right]_0^x = -e^{-x} + e^0 = 1 - e^{-x}$

Example continued
The cdf is
$$F(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-x}, & 0 \le x < \infty \end{cases}$$
[Figure: pdf $f(x) = e^{-x}$ with the shaded tail area $P(X > 1) = \int_1^{\infty} e^{-x}\,dx$, and cdf $F(x) = 1 - e^{-x}$, both plotted for 0 ≤ x ≤ 5.]
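
The two numerical claims, total mass 1 and $P(X > 1) = e^{-1}$, can be confirmed with scipy's quadrature (scipy and numpy assumed available):

```python
import numpy as np
from scipy import integrate

# f(x) = e^{-x} on (0, inf): check it integrates to 1 and that P(X > 1) = e^{-1}
f = lambda x: np.exp(-x)

total, _ = integrate.quad(f, 0, np.inf)
tail,  _ = integrate.quad(f, 1, np.inf)
print(total)             # 1.0
print(tail, np.exp(-1))  # 0.36788 0.36788, i.e. 1 - F(1)
```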

Joint Probability Density Function
Let X and Y be two continuous random variables defined on $-\infty < X < +\infty$ and $-\infty < Y < +\infty$. The joint pdf of X and Y, denoted f(x, y), satisfies
$$f(x, y) \ge 0, \quad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1, \quad P(a < X < b,\; c < Y < d) = \int_c^d \int_a^b f(x, y)\,dx\,dy$$

Joint Probability Density Function: Example
[Figure: surface plot of the joint density $f(x, y) = xy\,e^{-(x^2 + y^2)}$, $x > 0$, $y > 0$.]

Example
$$f(x, y) = \begin{cases} k(x + y), & \text{if } 0 < x < 1,\; 0 < y < 2 \\ 0, & \text{otherwise} \end{cases}$$
- Find the constant k.
- Find $P\left(0 < X < \tfrac{1}{2},\; 1 < Y < 2\right)$.

Example continued
First, for f(x, y) > 0 we must have k > 0. From the second pdf condition we have
$$\int_0^2 \int_0^1 k(x + y)\,dx\,dy = k \int_0^2 \left(\frac{1}{2} + y\right) dy = k \left[\frac{y}{2} + \frac{y^2}{2}\right]_0^2 = 3k = 1$$
Thus $k = \tfrac{1}{3}$. The joint pdf is
$$f(x, y) = \begin{cases} \frac{1}{3}(x + y), & \text{if } 0 < x < 1,\; 0 < y < 2 \\ 0, & \text{otherwise} \end{cases}$$

Example continued
The probability is found as a volume measure:
$$P\left(0 < X < \tfrac{1}{2},\; 1 < Y < 2\right) = \int_1^2 \int_0^{1/2} \frac{1}{3}(x + y)\,dx\,dy = \frac{1}{3}\int_1^2 \left(\frac{1}{8} + \frac{y}{2}\right) dy = \frac{1}{3}\left[\frac{y}{8} + \frac{y^2}{4}\right]_1^2 = \frac{7}{24}$$
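
Both results, k = 1/3 normalizing the density and the probability 7/24, can be confirmed with scipy's double quadrature; note that dblquad integrates func(y, x) with y as the inner variable:

```python
from scipy import integrate

# f(x,y) = (x+y)/3 on 0<x<1, 0<y<2: check total mass 1 and P(0<X<1/2, 1<Y<2) = 7/24
f = lambda y, x: (x + y) / 3.0  # dblquad convention: first argument is the inner variable

total, _ = integrate.dblquad(f, 0, 1, 0, 2)    # x in (0, 1) outer, y in (0, 2) inner
prob,  _ = integrate.dblquad(f, 0, 0.5, 1, 2)  # x in (0, 1/2), y in (1, 2)
print(total)         # 1.0
print(prob, 7 / 24)  # 0.29166... 0.29166...
```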

Marginal Density Function
Definition: the marginal pdf of X is
$$f(x) = \int f(x, y)\,dy$$
where the bounds of integration span the interval of Y. The marginal pdf of Y is
$$f(y) = \int f(x, y)\,dx$$
where the bounds of integration span the interval of X.

Example
Using the joint pdf from the previous example, find the marginal pdfs of X and Y.
$$f(x) = \int_0^2 \frac{1}{3}(x + y)\,dy = \frac{1}{3}\left[xy + \frac{y^2}{2}\right]_0^2 = \frac{2}{3}(x + 1)$$

Example continued
Marginal pdf of X:
$$f(x) = \begin{cases} \frac{2}{3}(x + 1), & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}$$
Marginal pdf of Y:
$$g(y) = \begin{cases} \frac{1}{3}\left(y + \frac{1}{2}\right), & 0 < y < 2 \\ 0, & \text{otherwise} \end{cases}$$

Conditional Density Function
Given Y = y, the conditional density of X is defined as
$$f(x \mid y) = \frac{f(x, y)}{f(y)}$$
Similarly, given X = x, the conditional density of Y is defined as
$$f(y \mid x) = \frac{f(x, y)}{f(x)}$$

Independence
Recall that two events A and B are independent if and only if
$$P(A \cap B) = P(A)P(B)$$
Similarly, X and Y are independent if and only if the following condition is satisfied:
$$f(x, y) = f(x) f(y)$$
In other words, if the joint density function can be written as the product of the marginal densities, then the two random variables are statistically independent.

Independence
More generally, let $X_1, X_2, \ldots, X_n$ be n continuous random variables. If the joint density can be written as the product of the marginal densities, then these random variables are independent:
$$f(x_1, x_2, \ldots, x_n) = f_1(x_1)\, f_2(x_2) \cdots f_n(x_n) = \prod_{j=1}^n f_j(x_j)$$
This property is useful for obtaining Maximum Likelihood estimators for certain parameters.

Independence: Example
Determine whether the random variables in the previous example are independent.
$$f(x)g(y) = \frac{2}{3}(x + 1) \cdot \frac{1}{3}\left(y + \frac{1}{2}\right) \ne f(x, y)$$
Thus X and Y are not independent.

Independence: Another Example
Determine whether X and Y are independent using the following joint pdf:
$$f(x, y) = \begin{cases} \frac{1}{9}, & \text{for } 1 < x < 4,\; 1 < y < 4 \\ 0, & \text{otherwise} \end{cases}$$
The marginal densities are
$$f(x) = \int_1^4 \frac{1}{9}\,dy = \frac{1}{3}, \quad g(y) = \int_1^4 \frac{1}{9}\,dx = \frac{1}{3}$$
Thus
$$f(x, y) = \frac{1}{9} = f(x)g(y) = \left(\frac{1}{3}\right)\left(\frac{1}{3}\right)$$
Therefore X and Y are statistically independent.

Normal Distribution
Notation: $X \sim N(\mu, \sigma^2)$. The pdf is given by
$$f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right), \quad -\infty < x < \infty$$
$E(X) = \mu$, $\quad \mathrm{Var}(X) = \sigma^2$, $\quad$ skewness $= 0$, $\quad$ kurtosis $= 3$

Normal Distribution, σ² = 1
[Figure: normal densities with σ² = 1 and different location parameters, µ = −2, 0, 2, 5.]

Normal Distribution, µ = 0
[Figure: normal densities with µ = 0 and different scale parameters, σ² = 1, 2, 3.]

Standard Normal Distribution
Define $Z = \dfrac{X - \mu}{\sigma}$. The pdf is given by
$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}z^2\right), \quad -\infty < z < \infty$$
$E(Z) = 0$, $\quad \mathrm{Var}(Z) = 1$. The cdf is
$$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}t^2\right) dt$$

Standard Normal Distribution
[Figure: standard normal pdf φ(z) and cdf Φ(z), plotted for −4 ≤ z ≤ 4.]

Calculating Normal Distribution Probabilities
Let $X \sim N(\mu, \sigma^2)$. We want to calculate the following probability:
$$P(a < X < b) = \int_a^b \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right) dx$$
There is no closed-form solution to this integral; it can only be calculated using numerical methods. This would require computational methods each time we need to evaluate a probability. Instead of this cumbersome method we can use normal distribution tables. Let us write the desired probability as follows:
$$P\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right)$$

Calculating Normal Distribution Probabilities
$$P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)$$
Here $\Phi(z) = P(Z \le z)$ is the value of the standard normal cdf at z. It can easily be found using standard normal tables.

Calculating Normal Distribution Probabilities
Since the standard normal distribution is symmetric, only positive values are listed in the tables. Using the symmetry property, one can find probabilities involving negative values by
$$\Phi(-z) = P(Z \le -z) = P(Z \ge z) = 1 - P(Z \le z) = 1 - \Phi(z)$$
e.g.:
$$P(Z \le -1.25) = \Phi(-1.25) = 1 - \Phi(1.25) = 1 - 0.8944 = 0.1056$$
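
Outside the classroom, the table lookup is replaced by a numerical cdf; a scipy sketch reproducing the values above (the N(µ, σ²) parameters in the second part are illustrative):

```python
from scipy import stats

# Standard normal cdf values used above, without tables
print(stats.norm.cdf(1.25))                          # 0.8944
print(stats.norm.cdf(-1.25))                         # 0.1056 = 1 - 0.8944
print(stats.norm.cdf(1.25) - stats.norm.cdf(-1.25))  # 0.7888

# General X ~ N(mu, sigma^2): P(a < X < b) via standardization (illustrative numbers)
mu, sigma, a, b = 10.0, 2.0, 8.0, 13.0
print(stats.norm.cdf((b - mu) / sigma) - stats.norm.cdf((a - mu) / sigma))
print(stats.norm.cdf(b, loc=mu, scale=sigma) - stats.norm.cdf(a, loc=mu, scale=sigma))
```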

Calculating Normal Distribution Probabilities
[Figure: standard normal density with both tails shaded, each of area $\Phi(-1.25) = 1 - \Phi(1.25) = 0.1056$.]
$$P(-1.25 < Z < 1.25) = \Phi(1.25) - \Phi(-1.25) = \Phi(1.25) - (1 - \Phi(1.25)) = 0.8944 - 0.1056 = 0.7888$$

CENTRAL LIMIT THEOREM - CLT
Let $X_1, X_2, \ldots, X_n$ be a random sample of n iid random variables, each having the same mean µ and variance σ². More compactly:
$$X_i \sim \text{i.i.d.}(\mu, \sigma^2), \quad i = 1, 2, \ldots, n$$
iid means: identically and independently distributed. Note that we do not mention the name of their distribution. The expected value and variance of the sum of these n iid variables are
$$E[X_1 + X_2 + \ldots + X_n] = E[X_1] + E[X_2] + \ldots + E[X_n] = n\mu$$
$$\mathrm{Var}[X_1 + X_2 + \ldots + X_n] = \mathrm{Var}[X_1] + \mathrm{Var}[X_2] + \ldots + \mathrm{Var}[X_n] = n\sigma^2$$

CENTRAL LIMIT THEOREM - CLT
Let X denote the sum of these rvs: $X = X_1 + X_2 + \ldots + X_n$. Then
$$Z = \frac{X - E(X)}{\sqrt{\mathrm{Var}(X)}} = \frac{X - n\mu}{\sqrt{n\sigma^2}} = \frac{X/n - \mu}{n^{-1/2}\,\sigma} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
According to the CLT, as $n \to \infty$ the expression above converges to the standard normal distribution: $Z \to N(0, 1)$.

CENTRAL LIMIT THEOREM - CLT
[Figure: histograms of simulated sample means for n = 1, 2, 3, 10, 30, 50, 75, 100, 1000; the sampling distribution narrows around 0.5 and becomes bell-shaped as n grows.]

Law of Large Numbers - LLN
According to the LLN, the sample mean of n iid random variables converges to the population mean as the sample size n increases. Let $\bar{X}_n = \frac{1}{n}(X_1 + X_2 + \ldots + X_n)$ be the sample mean; then the LLN says that as $n \to \infty$, $\bar{X}_n \to \mu$. In other words, for a positive number $\epsilon$, which we may choose arbitrarily small, we can write
$$\lim_{n \to \infty} P\left[\,|\bar{X}_n - \mu| < \epsilon\,\right] = 1$$
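
A small simulation illustrates the LLN; the exponential population and the seed below are arbitrary choices:

```python
import numpy as np

# LLN sketch: the running mean of iid draws settles near the population mean
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)  # population mean mu = 2
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
for n in (10, 100, 10_000, 100_000):
    print(n, running_mean[n - 1])             # drifts toward 2 as n grows
```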

CLT - Example
Let $X_1, X_2, \ldots, X_{12}$ be an iid random sample, each distributed as U(0, b), b > 0. Using the CLT, show that the probability $P\left(\frac{b}{4} < \bar{X} < \frac{3b}{4}\right)$ is approximately 0.9973.
Answer: These 12 iid rvs come from the same population, so we need to find the population mean and variance first. For the Uniform(a, b) distribution these quantities are
$$\mu_x = \frac{a+b}{2}, \quad \sigma_x^2 = \frac{(b-a)^2}{12}$$
Hence in our example
$$\mu_x = \frac{b}{2}, \quad \sigma_x^2 = \frac{b^2}{12}$$

CLT - Example, cont'd
Since $\mathrm{Var}(\bar{X}) = \frac{\sigma_x^2}{n} = \frac{b^2}{144}$, using the CLT we have
$$P\left(\frac{b}{4} < \bar{X} < \frac{3b}{4}\right) = P\left(\frac{\frac{b}{4} - \frac{b}{2}}{b/12} < \frac{\bar{X} - \mu_x}{\sqrt{\sigma_x^2/n}} < \frac{\frac{3b}{4} - \frac{b}{2}}{b/12}\right) = P(-3 < Z < 3) = \Phi(3) - (1 - \Phi(3)) = 0.99865 - (1 - 0.99865) = 0.9973$$
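
A Monte Carlo check of this answer (the value of b, the seed, and the number of replications are illustrative):

```python
import numpy as np
from scipy import stats

# Check P(b/4 < Xbar < 3b/4) ~ 0.9973 for the mean of 12 U(0,b) draws
rng = np.random.default_rng(0)
b = 4.0                                          # illustrative value of b
xbar = rng.uniform(0, b, size=(200_000, 12)).mean(axis=1)
print(np.mean((b/4 < xbar) & (xbar < 3*b/4)))    # ~ 0.9973
print(stats.norm.cdf(3) - stats.norm.cdf(-3))    # 0.99730, the CLT approximation
```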

Chi-square Distribution
The square of a standard normal random variable is distributed as chi-square with 1 degree of freedom:
If $Z \sim N(0, 1)$ then $Z^2 \sim \chi^2_1$
The sum of squares of n independent standard normal variables is distributed as chi-square with n degrees of freedom:
If $Z_i \sim \text{i.i.d. } N(0, 1)$, $i = 1, 2, \ldots, n$, then $\sum_{i=1}^n Z_i^2 \sim \chi^2_n$
The chi-square distribution has one parameter: ν (nu), the degrees of freedom. For $\chi^2_\nu$ the pdf is
$$f(x) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{\nu/2 - 1} e^{-x/2}, \quad x > 0, \; \nu > 0$$
where Γ is the gamma function:
$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx, \quad \alpha > 0$$

Chi-square Distribution
Let $\chi^2_\nu$ be a chi-square random variable with ν degrees of freedom. Then the expected value and variance are
$$E(\chi^2_\nu) = \nu \quad \text{and} \quad \mathrm{Var}(\chi^2_\nu) = 2\nu$$
If the population is normal, then the ratio of the sum of squared deviations from the sample mean to the population variance has a chi-square distribution with n − 1 degrees of freedom:
$$\frac{(n-1)s^2}{\sigma^2} = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sigma^2} \sim \chi^2_{n-1}$$
Meaning of dof: knowledge of the n quantities $(X_i - \bar{X})$ is equivalent to n − 1 mathematically independent components. Since we first estimate the sample mean from the observation set, there are n − 1 mathematically independent terms.

Chi-square Distribution
- Skewed to the right, i.e., the right tail is longer than the left. This means that the 3rd standard moment is positive.
- As the degrees of freedom parameter ν gets bigger, the distribution becomes more symmetric. In the limit, it converges to the normal distribution.
- Chi-square probabilities can easily be calculated using appropriate tables.

Chi-square Distribution
[Figure: chi-square densities for ν = 2, 4, 6, 12 degrees of freedom.]

Chi-square Distribution
[Figure: pdf of the chi-square distribution with ν = 10 degrees of freedom; the shaded right tail shows $P(\chi^2_{10} > 15.99) = 0.10$.]
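
The tail probability in the figure, and the mean/variance facts above, can be reproduced with scipy:

```python
from scipy import stats

nu = 10
print(stats.chi2.stats(nu, moments="mv"))  # (nu, 2*nu) = (10.0, 20.0)
print(stats.chi2.sf(15.99, df=nu))         # ~ 0.10, i.e. P(chi2_10 > 15.99)
```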

Student t Distribution
Let Z and Y be two independent rvs defined as follows: $Z \sim N(0, 1)$ and $Y \sim \chi^2_\nu$. The random variable defined below follows a t distribution with ν degrees of freedom:
$$t_\nu = \frac{Z}{\sqrt{Y/\nu}} \sim t_\nu$$
The rv $t_\nu$ has a t distribution with the ν dof defined in the denominator. The pdf is given by
$$f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma(\nu/2)\sqrt{\pi\nu}} \cdot \frac{1}{\left(1 + t^2/\nu\right)^{(\nu+1)/2}}, \quad -\infty < t < \infty$$
It has one parameter (ν) and a symmetric shape. $E(t_\nu) = 0$, and for $\nu \ge 3$, $\mathrm{Var}(t_\nu) = \nu/(\nu - 2)$. As $\nu \to \infty$, $t_\nu \to N(0, 1)$.

Student t Distribution
[Figure sequence: t densities for ν = 1, 2, 3, 4, 5, 10, 20, 30, each overlaid on the standard normal density; as ν increases the t density approaches the standard normal.]

Calculating Student t Probabilities
[Figure: Student t density with 5 dof; the shaded right tail shows $P(t_5 > 2.015) = \alpha = 0.05$.]

Calculating Student t Probabilities
[Figure: t density with 5 dof compared with the standard normal density; the t distribution has heavier tails.]

Calculating Student t Probabilities
$$P(-2.228 < t_{10} < 2.228) = 1 - 0.025 - 0.025 = 0.95$$
[Figure: Student t density with 10 dof; the central region between −2.228 and 2.228 has probability 1 − α = 0.95, with α/2 = 0.025 in each tail.]
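
The same table values follow from scipy's t distribution:

```python
from scipy import stats

print(stats.t.sf(2.015, df=5))                                 # ~ 0.05 = P(t_5 > 2.015)
print(stats.t.cdf(2.228, df=10) - stats.t.cdf(-2.228, df=10))  # ~ 0.95
```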

F Distribution
Let $X_1 \sim \chi^2_{k_1}$ and $X_2 \sim \chi^2_{k_2}$ be two independent chi-square random variables. Then the following random variable has an F distribution:
$$F = \frac{X_1/k_1}{X_2/k_2} \sim F(k_1, k_2)$$
The F distribution has two parameters: $k_1$ is the numerator degrees of freedom and $k_2$ is the denominator degrees of freedom.
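
A simulation sketch confirming the construction: the ratio of two independent scaled chi-squares matches the exact F tail. The degrees of freedom, seed, and the 2.60 cut-off are illustrative; 2.60 is roughly the upper 5% point of F(6, 20):

```python
import numpy as np
from scipy import stats

# F as a ratio of independent chi-squares, each divided by its dof
rng = np.random.default_rng(0)
k1, k2 = 6, 20
f_sim = (rng.chisquare(k1, 100_000) / k1) / (rng.chisquare(k2, 100_000) / k2)
print(np.mean(f_sim > 2.60))     # simulated tail frequency
print(stats.f.sf(2.60, k1, k2))  # exact tail: ~ 0.05 = P(F(6,20) > 2.60)
```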

F Distribution
[Figure: F densities for (k1, k2) = (2, 8), (6, 8), (6, 20), (10, 30).]