The probability density function of a continuous random variable Y (or the probability mass function if Y is discrete) is referred to simply as a probability distribution and denoted by f(y; θ), where θ represents the parameters of the distribution.

We use dot (•) subscripts for summation and bars (¯) for means; thus

    \bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i = \frac{1}{N} y_\bullet .

The expected value and variance of a random variable Y are denoted by E(Y) and var(Y) respectively. Suppose the random variables Y_1, ..., Y_n are independent with E(Y_i) = µ_i and var(Y_i) = σ_i² for i = 1, ..., n. Let the random variable W be a linear combination of the Y_i's,

    W = a_1 Y_1 + a_2 Y_2 + \cdots + a_n Y_n,    (1.1)

where the a_i's are constants. Then the expected value of W is

    E(W) = a_1 \mu_1 + a_2 \mu_2 + \cdots + a_n \mu_n    (1.2)

and its variance is

    var(W) = a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2 + \cdots + a_n^2 \sigma_n^2.    (1.3)
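As an aside, equations (1.2) and (1.3) are easy to check by simulation. The sketch below assumes Python with numpy; the values of a, µ, and σ² are arbitrary illustrations, not taken from the text.

```python
import numpy as np

# Minimal Monte Carlo check of equations (1.2) and (1.3).
# The coefficients, means and variances below are arbitrary choices.
rng = np.random.default_rng(0)
a = np.array([2.0, -1.0, 0.5])
mu = np.array([1.0, 3.0, -2.0])
sigma2 = np.array([4.0, 1.0, 9.0])

EW = a @ mu             # equation (1.2)
varW = a**2 @ sigma2    # equation (1.3)

# Independent Y_i (Normal is a convenient choice; (1.2)-(1.3) need only independence).
Y = rng.normal(mu, np.sqrt(sigma2), size=(100_000, 3))
W = Y @ a
print(EW, W.mean())     # the two numbers in each pair should agree closely
print(varW, W.var())
```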

1.4 Distributions related to the Normal distribution

The sampling distributions of many of the estimators and test statistics used in this book depend on the Normal distribution. They do so either directly, because they are derived from Normally distributed random variables, or asymptotically, via the Central Limit Theorem for large samples. This section gives definitions and notation for these distributions and summarizes the relationships between them. The exercises at the end of the chapter provide practice in using these results, which are employed extensively in subsequent chapters.

1.4.1 Normal distributions

1. If the random variable Y has the Normal distribution with mean µ and variance σ², its probability density function is

    f(y; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^2 \right].

We denote this by Y ~ N(µ, σ²).

2. The Normal distribution with µ = 0 and σ² = 1, Y ~ N(0, 1), is called the standard Normal distribution.

3. Let Y_1, ..., Y_n denote Normally distributed random variables with Y_i ~ N(µ_i, σ_i²) for i = 1, ..., n, and let the covariance of Y_i and Y_j be denoted by cov(Y_i, Y_j) = ρ_{ij} σ_i σ_j, where ρ_{ij} is the correlation coefficient for Y_i and Y_j. Then the joint distribution of the Y_i's is the multivariate Normal distribution with mean vector µ = [µ_1, ..., µ_n]^T and variance-covariance matrix V with diagonal elements σ_i² and off-diagonal elements ρ_{ij} σ_i σ_j for i ≠ j. We write this as y ~ N(µ, V), where y = [Y_1, ..., Y_n]^T.

4. Suppose the random variables Y_1, ..., Y_n are independent and Normally distributed, Y_i ~ N(µ_i, σ_i²) for i = 1, ..., n, and let W = a_1 Y_1 + a_2 Y_2 + ... + a_n Y_n, where the a_i's are constants. Then W is also Normally distributed, so that

    W = \sum_{i=1}^{n} a_i Y_i \sim N\left( \sum_{i=1}^{n} a_i \mu_i, \; \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right)

by equations (1.2) and (1.3). (A numerical illustration follows this list.)
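The Normality claim in item 4 (not just the moments) can be checked empirically. A minimal sketch, assuming Python with numpy and scipy and arbitrary example values: it compares simulated values of W against the exact Normal distribution implied by (1.2) and (1.3) via a Kolmogorov-Smirnov test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = np.array([1.0, -2.0, 0.5])       # arbitrary coefficients
mu = np.array([0.0, 1.0, 2.0])       # arbitrary means
sigma = np.array([1.0, 2.0, 0.5])    # arbitrary standard deviations

W = rng.normal(mu, sigma, size=(50_000, 3)) @ a
EW = a @ mu                          # mean from (1.2)
sdW = np.sqrt(a**2 @ sigma**2)       # standard deviation from (1.3)

# KS test against the claimed N(EW, sdW^2): a large p-value means
# no evidence against exact Normality of W.
print(stats.kstest(W, 'norm', args=(EW, sdW)))
```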

1.4.2 Chi-squared distribution

1. The central chi-squared distribution with n degrees of freedom is defined as the sum of squares of n independent random variables Z_1, ..., Z_n, each with the standard Normal distribution. It is denoted by

    X^2 = \sum_{i=1}^{n} Z_i^2 \sim \chi^2(n).

In matrix notation, if z = [Z_1, ..., Z_n]^T, then z^T z = Σ Z_i², so that X² = z^T z ~ χ²(n).

2. If X² has the distribution χ²(n), then its expected value is E(X²) = n and its variance is var(X²) = 2n.

3. If Y_1, ..., Y_n are independent Normally distributed random variables, each with the distribution Y_i ~ N(µ_i, σ_i²), then

    X^2 = \sum_{i=1}^{n} \left( \frac{Y_i - \mu_i}{\sigma_i} \right)^2 \sim \chi^2(n)    (1.4)

because each of the variables Z_i = (Y_i − µ_i)/σ_i has the standard Normal distribution N(0, 1).

4. Let Z_1, ..., Z_n be independent random variables, each with the distribution N(0, 1), and let Y_i = Z_i + µ_i, where at least one of the µ_i's is non-zero. Then the distribution of

    \sum Y_i^2 = \sum (Z_i + \mu_i)^2 = \sum Z_i^2 + 2 \sum Z_i \mu_i + \sum \mu_i^2

has larger mean n + λ and larger variance 2n + 4λ than χ²(n), where λ = Σ µ_i². This is called the non-central chi-squared distribution with n degrees of freedom and non-centrality parameter λ. It is denoted by χ²(n, λ). (Items 2 and 4 are checked by simulation in the sketch after this list.)

5. Suppose that the Y_i's are not necessarily independent and that the vector y = [Y_1, ..., Y_n]^T has the multivariate Normal distribution y ~ N(µ, V), where the variance-covariance matrix V is non-singular with inverse V⁻¹. Then

    X^2 = (y - \mu)^T V^{-1} (y - \mu) \sim \chi^2(n).    (1.5)

6. More generally, if y ~ N(µ, V), then the random variable y^T V⁻¹ y has the non-central chi-squared distribution χ²(n, λ), where λ = µ^T V⁻¹ µ.

7. If X_1², ..., X_m² are m independent random variables with the chi-squared distributions X_i² ~ χ²(n_i, λ_i), which may or may not be central, then their sum also has a chi-squared distribution with Σ n_i degrees of freedom and non-centrality parameter Σ λ_i, i.e.,

    \sum_{i=1}^{m} X_i^2 \sim \chi^2\left( \sum_{i=1}^{m} n_i, \; \sum_{i=1}^{m} \lambda_i \right).

This is called the reproductive property of the chi-squared distribution.

8. Let y ~ N(µ, V), where y has n elements but the Y_i's are not independent, so that V is singular with rank k < n and the inverse of V is not uniquely defined. Let V⁻ denote a generalized inverse of V. Then the random variable y^T V⁻ y has the non-central chi-squared distribution with k degrees of freedom and non-centrality parameter λ = µ^T V⁻ µ.

For further details about properties of the chi-squared distribution see Rao (1973, Chapter 3).
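A short simulation sketch of items 2 and 4, assuming numpy (the µ vector is an arbitrary choice): the sample moments of central and non-central chi-squared variables are compared with n, 2n, n + λ, and 2n + 4λ.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
mu = np.array([1.0, 0.0, 2.0, 0.0, -1.0])   # arbitrary shift vector
lam = np.sum(mu**2)                          # non-centrality parameter

Z = rng.standard_normal((200_000, n))
X2 = (Z**2).sum(axis=1)                      # ~ chi2(n), item 1
X2nc = ((Z + mu)**2).sum(axis=1)             # ~ chi2(n, lam), item 4

print(X2.mean(), n)                          # ~ n
print(X2.var(), 2 * n)                       # ~ 2n
print(X2nc.mean(), n + lam)                  # ~ n + lambda
print(X2nc.var(), 2 * n + 4 * lam)           # ~ 2n + 4*lambda
```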

1.4.3 t-distribution

The t-distribution with n degrees of freedom is defined as the ratio of two independent random variables. The numerator has the standard Normal distribution and the denominator is the square root of a central chi-squared random variable divided by its degrees of freedom; that is,

    T = \frac{Z}{(X^2 / n)^{1/2}}    (1.6)

where Z ~ N(0, 1), X² ~ χ²(n), and Z and X² are independent. This is denoted by T ~ t(n).

1.4.4 F-distribution

1. The central F-distribution with n and m degrees of freedom is defined as the ratio of two independent central chi-squared random variables, each divided by its degrees of freedom,

    F = \frac{X_1^2}{n} \bigg/ \frac{X_2^2}{m}    (1.7)

where X_1² ~ χ²(n), X_2² ~ χ²(m), and X_1² and X_2² are independent. This is denoted by F ~ F(n, m).

2. The relationship between the t-distribution and the F-distribution can be derived by squaring the terms in equation (1.6) and using definition (1.7) to obtain

    T^2 = \frac{Z^2}{1} \bigg/ \frac{X^2}{n} \sim F(1, n);    (1.8)

that is, the square of a random variable with the t-distribution t(n) has the F-distribution F(1, n). (The sketch after this list illustrates this numerically.)

3. The non-central F-distribution is defined as the ratio of two independent random variables, each divided by its degrees of freedom, where the numerator has a non-central chi-squared distribution and the denominator has a central chi-squared distribution, i.e.,

    F = \frac{X_1^2}{n} \bigg/ \frac{X_2^2}{m}

where X_1² ~ χ²(n, λ) with λ = µ^T V⁻¹ µ, X_2² ~ χ²(m), and X_1² and X_2² are independent. The mean of a non-central F-distribution is larger than the mean of a central F-distribution with the same degrees of freedom.
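Relationship (1.8) can be illustrated by building T from definition (1.6) and F from definition (1.7), then comparing quantiles of T² with those of F(1, n). A minimal sketch, assuming numpy; n is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(3)
n, N = 8, 200_000

Z = rng.standard_normal(N)
T = Z / np.sqrt(rng.chisquare(n, N) / n)                   # definition (1.6)
F = (rng.chisquare(1, N) / 1) / (rng.chisquare(n, N) / n)  # definition (1.7)

# The sample quantiles of T^2 and of F(1, n) should agree closely.
qs = [0.50, 0.90, 0.95, 0.99]
print(np.quantile(T**2, qs))
print(np.quantile(F, qs))
```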

1.5 Quadratic forms

1. A quadratic form is a polynomial expression in which each term has degree 2. Thus y_1² + y_2² and 2y_1² + y_2² + 3y_1 y_2 are quadratic forms in y_1 and y_2, but y_1² + y_2² + 2y_1 and y_1² + 3y_2 are not.

2. Let A be a symmetric matrix

    A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}

where a_{ij} = a_{ji}; then the expression y^T A y = Σ_i Σ_j a_{ij} y_i y_j is a quadratic form in the y_i's. The expression (y − µ)^T V⁻¹ (y − µ) is a quadratic form in the terms (y_i − µ_i), but not in the y_i's.

3. The quadratic form y^T A y and the matrix A are said to be positive definite if y^T A y > 0 whenever the elements of y are not all zero. A necessary and sufficient condition for positive definiteness is that all the determinants

    A_1 = a_{11}, \quad A_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad A_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \quad \ldots, \quad A_n = \det A

are all positive.

4. The rank of the matrix A is also called the degrees of freedom of the quadratic form Q = y^T A y.

5. Suppose Y_1, ..., Y_n are independent random variables each with the Normal distribution N(0, σ²). Let Q = Σ Y_i² and let Q_1, ..., Q_k be quadratic forms in the Y_i's such that

    Q = Q_1 + Q_2 + \cdots + Q_k

where Q_i has m_i degrees of freedom (i = 1, ..., k). Then Q_1, ..., Q_k are independent random variables with Q_1/σ² ~ χ²(m_1), Q_2/σ² ~ χ²(m_2), ..., Q_k/σ² ~ χ²(m_k) if and only if m_1 + m_2 + ... + m_k = n. This is Cochran's theorem; for a proof see, for example, Hogg and Craig (1995). A similar result holds for non-central distributions; see Chapter 3 of Rao (1973). (A simulation illustrating the theorem follows this list.)

6. A consequence of Cochran's theorem is that the difference of two independent random variables, X_1² ~ χ²(m) and X_2² ~ χ²(k),

    X^2 = X_1^2 - X_2^2 \sim \chi^2(m - k),

also has a chi-squared distribution, provided that X² ≥ 0 and m > k.
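The case k = 2 of Cochran's theorem underlies the familiar split of a total sum of squares into a part about the sample mean (n − 1 degrees of freedom) and a part due to the mean itself (1 degree of freedom). A simulation sketch, assuming numpy; n and σ² are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n, sigma2 = 10, 4.0
Y = rng.normal(0.0, np.sqrt(sigma2), size=(100_000, n))

Ybar = Y.mean(axis=1, keepdims=True)
Q1 = ((Y - Ybar)**2).sum(axis=1)     # about the mean: n - 1 df
Q2 = n * Ybar[:, 0]**2               # due to the mean: 1 df; Q = Q1 + Q2

print((Q1 / sigma2).mean(), n - 1)   # chi2(n-1) has mean n - 1
print((Q2 / sigma2).mean(), 1.0)     # chi2(1) has mean 1
print(np.corrcoef(Q1, Q2)[0, 1])     # near 0, consistent with independence
```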

1.6 Estimation

1.6.1 Maximum likelihood estimation

Let y = [Y_1, ..., Y_n]^T denote a random vector and let the joint probability density function of the Y_i's be f(y; θ), which depends on the vector of parameters θ = [θ_1, ..., θ_p]^T.

The likelihood function L(θ; y) is algebraically the same as the joint probability density function f(y; θ), but the change in notation reflects a shift of emphasis from the random variables y, with θ fixed, to the parameters θ, with y fixed. Since L is defined in terms of the random vector y, it is itself a random variable. Let Ω denote the set of all possible values of the parameter vector θ; Ω is called the parameter space. The maximum likelihood estimator of θ is the value θ̂ which maximizes the likelihood function, that is,

    L(\hat{\theta}; y) \geq L(\theta; y) for all θ in Ω.

Equivalently, θ̂ is the value which maximizes the log-likelihood function l(θ; y) = log L(θ; y), since the logarithmic function is monotonic. Thus

    l(\hat{\theta}; y) \geq l(\theta; y) for all θ in Ω.

Often it is easier to work with the log-likelihood function than with the likelihood function itself.

Usually the estimator θ̂ is obtained by differentiating the log-likelihood function with respect to each element θ_j of θ and solving the simultaneous equations

    \frac{\partial l(\theta; y)}{\partial \theta_j} = 0 for j = 1, ..., p.    (1.9)

It is necessary to check that the solutions do correspond to maxima of l(θ; y) by verifying that the matrix of second derivatives

    \frac{\partial^2 l(\theta; y)}{\partial \theta_j \, \partial \theta_k}

evaluated at θ = θ̂ is negative definite. For example, if θ has only one element θ, this means it is necessary to check that

    \left[ \frac{\partial^2 l(\theta; y)}{\partial \theta^2} \right]_{\theta = \hat{\theta}} < 0.

It is also necessary to check whether there are any values of θ at the edges of the parameter space Ω that give local maxima of l(θ; y). When all local maxima have been identified, the value of θ̂ corresponding to the largest one is the maximum likelihood estimator. (For most of the models considered in this book there is only one maximum and it corresponds to the solution of the equations ∂l/∂θ_j = 0, j = 1, ..., p.)

An important property of maximum likelihood estimators is that if g(θ) is any function of the parameters θ, then the maximum likelihood estimator of g(θ) is g(θ̂). This follows from the definition of θ̂. It is sometimes called the invariance property of maximum likelihood estimators. A consequence is that we can work with a function of the parameters that is convenient for maximum likelihood estimation and then use the invariance property to obtain maximum likelihood estimates for the required parameters.

In principle, it is not necessary to be able to find the derivatives of the likelihood or log-likelihood functions or to solve equation (1.9) if θ̂ can be found numerically. In practice, numerical approximations are very important for generalized linear models.

Other properties of maximum likelihood estimators include consistency, sufficiency, asymptotic efficiency and asymptotic normality. These are discussed in books such as Cox and Hinkley (1974) or Kalbfleisch (1985, Chapters 1 and 2).
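The numerical approach mentioned above can be sketched as follows, assuming Python with numpy and scipy; the Normal sample and starting values are arbitrary illustrations, not an example from the text. The sketch also uses the invariance property: the optimization is done in (µ, log σ), and σ̂ is recovered by exponentiating.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
y = rng.normal(2.0, 1.5, size=200)   # hypothetical Normal sample

def negloglik(theta):
    # Negative Normal log-likelihood, up to an additive constant,
    # parameterized by (mu, log sigma) so the search is unconstrained.
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    return 0.5 * np.sum(((y - mu) / sigma)**2) + y.size * log_sigma

res = minimize(negloglik, x0=[0.0, 0.0])
print(res.x[0], np.exp(res.x[1]))    # close to y.mean() and y.std()
```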

1.6.2 Example: Poisson distribution

Let Y_1, ..., Y_n be independent random variables, each with the Poisson distribution

    f(y_i; \theta) = \frac{\theta^{y_i} e^{-\theta}}{y_i!}, \quad y_i = 0, 1, 2, \ldots

with the same parameter θ. Their joint distribution is

    f(y_1, \ldots, y_n; \theta) = \prod_{i=1}^{n} f(y_i; \theta) = \frac{\theta^{y_1} e^{-\theta}}{y_1!} \cdot \frac{\theta^{y_2} e^{-\theta}}{y_2!} \cdots \frac{\theta^{y_n} e^{-\theta}}{y_n!} = \frac{\theta^{\sum y_i} e^{-n\theta}}{y_1! \, y_2! \cdots y_n!}.

This is also the likelihood function L(θ; y_1, ..., y_n). It is easier to use the log-likelihood function

    l(\theta; y_1, \ldots, y_n) = \log L(\theta; y_1, \ldots, y_n) = \left( \sum y_i \right) \log \theta - n\theta - \sum \log y_i!.

To find the maximum likelihood estimate θ̂, use

    \frac{dl}{d\theta} = \frac{1}{\theta} \sum y_i - n.

Equate this to zero to obtain the solution θ̂ = Σ y_i / n = ȳ. Since d²l/dθ² = −Σ y_i/θ² < 0, l has its maximum value when θ = θ̂, confirming that ȳ is the maximum likelihood estimate. (A quick numerical check follows.)
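A minimal numerical check of this result, assuming numpy; the counts are hypothetical.

```python
import numpy as np

y = np.array([3, 1, 4, 2, 0, 5, 2])      # hypothetical Poisson counts
theta_hat = y.mean()                      # the MLE, ybar

score = y.sum() / theta_hat - y.size      # dl/dtheta: zero at theta_hat
d2l = -y.sum() / theta_hat**2             # second derivative: negative, so a maximum
print(theta_hat, score, d2l)
```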

1.6.3 Least Squares Estimation

Let Y_1, ..., Y_n be independent random variables with expected values µ_1, ..., µ_n, respectively. Suppose that the µ_i's are functions of the parameter vector that we want to estimate, β = [β_1, ..., β_p]^T, with p < n. Thus

    E(Y_i) = \mu_i(\beta).

The simplest form of the method of least squares consists of finding the estimator β̂ that minimizes the sum of squares of the differences between the Y_i's and their expected values,

    S = \sum [Y_i - \mu_i(\beta)]^2.

Usually β̂ is obtained by differentiating S with respect to each element β_j of β and solving the simultaneous equations

    \frac{\partial S}{\partial \beta_j} = 0, \quad j = 1, \ldots, p.

Of course it is necessary to check that the solutions correspond to minima (i.e., that the matrix of second derivatives is positive definite) and to identify the global minimum from among these solutions and any local minima at the boundary of the parameter space.

Now suppose that the Y_i's have variances σ_i² that are not all equal. Then it may be desirable to minimize the weighted sum of squared differences

    S = \sum w_i [Y_i - \mu_i(\beta)]^2,

where the weights are w_i = (σ_i²)⁻¹. In this way, the observations that are less reliable (that is, the Y_i's with the larger variances) have less influence on the estimates.

More generally, let y = [Y_1, ..., Y_n]^T denote a random vector with mean vector µ = [µ_1, ..., µ_n]^T and variance-covariance matrix V. Then the weighted least squares estimator is obtained by minimizing

    S = (y - \mu)^T V^{-1} (y - \mu).

(A sketch of this criterion in the linear case is given below.)
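For the linear case µ = Xβ, minimizing S = (y − Xβ)^T V⁻¹ (y − Xβ) has the standard closed-form generalized least squares solution β̂ = (X^T V⁻¹ X)⁻¹ X^T V⁻¹ y; this is a well-known result rather than something derived in the text above. A minimal sketch, assuming numpy and simulated data.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical straight-line mean mu_i = beta0 + beta1 * x_i with
# unequal variances, so the weights w_i = 1/sigma_i^2 matter.
x = np.linspace(0.0, 10.0, 50)
X = np.column_stack([np.ones_like(x), x])
sigma2 = 0.5 + 0.3 * x                   # larger variance -> smaller weight
y = X @ np.array([1.0, 2.0]) + rng.normal(0.0, np.sqrt(sigma2))

Vinv = np.diag(1.0 / sigma2)             # V is diagonal in this example
beta_hat = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
print(beta_hat)                          # close to the true [1.0, 2.0]
```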

1.6.4 Comments on estimation

1. An important distinction between the methods of maximum likelihood and least squares is that the method of least squares can be used without making assumptions about the distributions of the response variables Y_i beyond specifying their expected values and possibly their variance-covariance structure. In contrast, to obtain maximum likelihood estimators we need to specify the joint probability distribution of the Y_i's.

2. For many situations maximum likelihood and least squares estimators are identical.

3. Often numerical methods, rather than calculus, may be needed to obtain parameter estimates that maximize the likelihood or log-likelihood function or minimize the sum of squares. The following example illustrates this approach.

1.6.5 Example: Tropical cyclones

Table 1.2 shows the number of tropical cyclones in Northeastern Australia for thirteen successive seasons, a period of fairly consistent conditions for the definition and tracking of cyclones (Dobson and Stewart, 1974).

Table 1.2 Numbers of tropical cyclones in 13 successive seasons.

    Season:           1  2  3  4  5  6   7  8  9  10  11  12  13
    No. of cyclones:  6  5  4  6  6  3  12  7  4   2   6   7   4

Let Y_i denote the number of cyclones in season i, where i = 1, ..., 13. Suppose the Y_i's are independent random variables with the Poisson distribution with parameter θ. From Example 1.6.2, θ̂ = ȳ = 72/13 = 5.538.

An alternative approach would be to find numerically the value of θ that maximizes the log-likelihood function. The component of the log-likelihood function due to y_i is

    l_i = y_i \log \theta - \theta - \log y_i!.

The log-likelihood function is the sum of these terms,

    l = \sum_{i=1}^{13} l_i = \sum_{i=1}^{13} (y_i \log \theta - \theta - \log y_i!).

Only the first two terms in the brackets involve θ and so are relevant to the optimization calculation, because the term Σ log y_i! is a constant. To plot the log-likelihood function (without the constant term) against θ, calculate (y_i log θ − θ) for each y_i at various values of θ and add the results to obtain

    l^* = \sum (y_i \log \theta - \theta).

[Figure 1.1: graph showing the location of the maximum likelihood estimate for the data in Table 1.2 on tropical cyclones.]

Figure 1.1 shows l* plotted against θ. Clearly the maximum value is between θ = 5 and θ = 6. This can provide a starting point for an iterative procedure for obtaining θ̂. The results of a simple bisection calculation are shown in Table 1.3. The function l* is first calculated for the approximations θ^(1) = 5 and θ^(2) = 6. Then each subsequent approximation θ^(k), for k = 3, 4, ..., is the average of the two previous estimates of θ with the largest values of l* (for example, θ^(6) = ½(θ^(5) + θ^(3))). After seven steps this process gives θ̂ ≈ 5.54, which is correct to two decimal places. (A sketch of this calculation follows.)

Table 1.3 Successive approximations θ^(k), with the corresponding values of l*, for the maximum likelihood estimate of the mean number of cyclones per season. (Numerical entries not reproduced here.)
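A sketch of the bisection scheme just described, assuming numpy and using the counts from Table 1.2.

```python
import numpy as np

# Counts from Table 1.2 (13 seasons, 72 cyclones in total).
y = np.array([6, 5, 4, 6, 6, 3, 12, 7, 4, 2, 6, 7, 4])

def l_star(theta):
    # Log-likelihood without the constant term sum(log y_i!).
    return np.sum(y * np.log(theta) - theta)

approx = [5.0, 6.0]                      # theta(1) and theta(2)
for _ in range(7):
    best_two = sorted(approx, key=l_star)[-2:]
    approx.append(sum(best_two) / 2.0)   # average the two best so far

print(approx[-1])                        # approaches theta_hat = 72/13 = 5.54
```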

1.7 Exercises

1.1 Let Y_1 and Y_2 be independent random variables with Y_1 ~ N(1, 3) and Y_2 ~ N(2, 5). If W_1 = Y_1 + 2Y_2 and W_2 = 4Y_1 − Y_2, what is the joint distribution of W_1 and W_2?

1.2 Let Y_1 and Y_2 be independent random variables with Y_1 ~ N(0, 1) and Y_2 ~ N(3, 4).
(a) What is the distribution of Y_1²?
(b) If y = [Y_1, (Y_2 − 3)/2]^T, obtain an expression for y^T y. What is its distribution?
(c) If y = [Y_1, Y_2]^T and its distribution is y ~ N(µ, V), obtain an expression for y^T V⁻¹ y. What is its distribution?

1.3 Let the joint distribution of Y_1 and Y_2 be N(µ, V), with a given 2 × 1 mean vector µ and 2 × 2 variance-covariance matrix V (numerical entries not reproduced here).
(a) Obtain an expression for (y − µ)^T V⁻¹ (y − µ). What is its distribution?
(b) Obtain an expression for y^T V⁻¹ y. What is its distribution?

1.4 Let Y_1, ..., Y_n be independent random variables, each with the distribution N(µ, σ²). Let

    \bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i \quad and \quad S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2.

(a) What is the distribution of Ȳ?
(b) Show that S² = (1/(n−1)) [ Σ(Y_i − µ)² − n(Ȳ − µ)² ].
(c) From (b) it follows that Σ(Y_i − µ)²/σ² = (n−1)S²/σ² + n(Ȳ − µ)²/σ². How does this allow you to deduce that Ȳ and S² are independent?
(d) What is the distribution of (n−1)S²/σ²?
(e) What is the distribution of (Ȳ − µ)/(S/√n)?

1.5 This exercise is a continuation of the example in Section 1.6.2, in which Y_1, ..., Y_n are independent Poisson random variables with the parameter θ.
(a) Show that E(Y_i) = θ for i = 1, ..., n.
(b) Suppose θ = e^β. Find the maximum likelihood estimator of β.
(c) Minimize S = Σ(Y_i − e^β)² to obtain a least squares estimator of β.

1.6 The data in Table 1.4 are the numbers of females and males in the progeny of 16 female light brown apple moths in Muswellbrook, New South Wales, Australia (from Lewis, 1987).

Table 1.4 Progeny of light brown apple moths: progeny group, number of females, number of males. (Numerical entries not reproduced here.)

(a) Calculate the proportion of females in each of the 16 groups of progeny.
(b) Let Y_i denote the number of females and n_i the number of progeny in each group, i = 1, ..., 16. Suppose the Y_i's are independent random variables, each with the binomial distribution

    f(y_i; \theta) = \binom{n_i}{y_i} \theta^{y_i} (1 - \theta)^{n_i - y_i}.

Find the maximum likelihood estimator of θ using calculus and evaluate it for these data.
(c) Use a numerical method to estimate θ̂ and compare the answer with the one from (b).
