
Chapter 1

What Is Econometrics?

1.1 Data

1.1.1 Accounting

Definition 1.1 Accounting data is routinely recorded data (that is, records of transactions) collected as part of market activities.

Most of the data used in econometrics is accounting data. Accounting data has many shortcomings, since it is not collected specifically for the purposes of the econometrician. An econometrician would prefer data collected in experiments or gathered for her.

1.1.2 Nonexperimental

The econometrician does not have any control over nonexperimental data. This causes various problems. Econometrics is used to deal with or solve these problems.

1.1.3 Data Types

Definition 1.2 Time-series data are data that represent repeated observations of some variable in subsequent time periods. A time-series variable is often subscripted with the letter t.

Definition 1.3 Cross-sectional data are data that represent a set of observations of some variable at one specific instant over several agents. A cross-sectional variable is often subscripted with the letter i.

Definition 1.4 Time-series cross-sectional data are data that are both time-series and cross-sectional.

A special case of time-series cross-sectional data is panel data. Panel data are observations of the same set of agents over time.

1.1.4 Empirical Regularities

We need to look at the data to detect regularities. Often, we use stylized facts, but this can lead to over-simplifications.

1.2 Models

Models are simplifications of the real world. The data we use in our model is what motivates theory.

1.2.1 Economic Theory

Models can be developed by choosing assumptions grounded in economic theory.

1.2.2 Postulated Relationships

Models can also be developed by postulating relationships among the variables.

1.2.3 Equations

Economic models are usually stated in terms of one or more equations.

1.2.4 Error Component

Because none of our models are exactly correct, we include an error component into our equations, usually denoted u_i. In econometrics, we usually assume that the error component is stochastic (that is, random). It is important to note that the error component cannot be modeled by economic theory. We impose assumptions on u_i and, as econometricians, focus on u_i.

1.3 Statistics

1.3.1 Stochastic Component

Some stuff here.

1.3.2 Statistical Analysis

Some stuff here.

1.3.3 Tasks

There are three main tasks of econometrics:

1. estimating parameters;
2. hypothesis testing;
3. forecasting.

Forecasting is perhaps the main reason for econometrics.

1.3.4 Econometrics

Since our data is, in general, nonexperimental, econometrics makes use of economic theory to adjust for the lack of proper data.

1.4 Interaction

1.4.1 Scientific Method

Econometrics uses the scientific method, but the data are nonexperimental. In some sense, this is similar to astronomy: astronomers gather data but cannot conduct experiments (for example, astronomers predict the existence of black holes, but have never made one in a lab).

1.4.2 Role Of Econometrics

The three components of econometrics are:

1. theory;
2. statistics;
3. data.

These components are interdependent, and each helps shape the others.

1.4.3 Occam's Razor

Often in econometrics, we are faced with the problem of choosing one model over an alternative. The simplest model that fits the data is often the best choice.

Chapter 2

Some Useful Distributions

2.1 Introduction

2.1.1 Inferences

Statistical Statements As statisticians, we are often called upon to answer questions or make statements concerning certain random variables. For example: is a coin fair (i.e., is the probability of heads 0.5)? Or, what is the expected value of GNP for the quarter?

Population Distribution Typically, answering such questions requires knowledge of the distribution of the random variable. Unfortunately, we usually do not know this distribution (although we may have a strong idea, as in the case of the coin).

Experimentation In order to gain knowledge of the distribution, we draw several realizations of the random variable. The notion is that the observations in this sample contain information concerning the population distribution.

Inference

Definition 2.1 The process by which we make statements concerning the population distribution based on the sample observations is called inference.

Example 2.1 We decide whether a coin is fair by tossing it several times and observing whether it seems to be heads about half the time.

2.1.2 Random Samples

Suppose we draw n observations of a random variable, denoted {x_1, x_2, ..., x_n}, and each x_i is independent and has the same (marginal) distribution; then {x_1, x_2, ..., x_n} constitute a simple random sample.

Example 2.2 We toss a coin three times. Supposedly, the outcomes are independent. If x_i counts the number of heads for toss i, then we have a simple random sample.

Note that not all samples are simple random samples.

Example 2.3 We are interested in the income level for the population in general. The n observations available in this class are not identically distributed, since the higher-income individuals will tend to be more variable.

Example 2.4 Consider the aggregate consumption level. The n observations available in this set are not independent, since a high consumption level in one period is usually followed by a high level in the next.

2.1.3 Sample Statistics

Definition 2.2 Any function of the observations in the sample which is the basis for inference is called a sample statistic.

Example 2.5 In the coin-tossing experiment, let S count the total number of heads and P = S/3 the sample proportion of heads. Both S and P are sample statistics.

2.1.4 Sample Distributions

A sample statistic is a random variable; its value will vary from one experiment to another. As a random variable, it is subject to a distribution.

Definition 2.3 The distribution of the sample statistic is the sample distribution of the statistic.

Example 2.6 The statistic S introduced above has a multinomial sample distribution. Specifically, Pr(S = 0) = 1/8, Pr(S = 1) = 3/8, Pr(S = 2) = 3/8, and Pr(S = 3) = 1/8.
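The sample distribution of S in Example 2.6 can be verified by brute-force enumeration of the 2³ equally likely outcomes. A minimal sketch (the function name is illustrative, not from the text):

```python
from itertools import product
from fractions import Fraction

def heads_distribution(n_tosses):
    """Exact sample distribution of S = total number of heads in
    n_tosses fair-coin tosses, by enumerating every outcome."""
    outcomes = list(product([0, 1], repeat=n_tosses))  # 2^n equally likely points
    dist = {}
    for outcome in outcomes:
        s = sum(outcome)
        dist[s] = dist.get(s, 0) + Fraction(1, len(outcomes))
    return dist

dist = heads_distribution(3)
# Matches Example 2.6: Pr(S=0)=1/8, Pr(S=1)=3/8, Pr(S=2)=3/8, Pr(S=3)=1/8
```

Using exact rational arithmetic (Fraction) avoids any floating-point rounding in the comparison with the stated probabilities.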

2.2 Normality And The Sample Mean

2.2.1 Sample Sum

Consider the simple random sample {x_1, x_2, ..., x_n}, where x_i measures the height of an adult female. We will assume that E x_i = μ and Var(x_i) = σ², for all i = 1, 2, ..., n. Let S = x_1 + x_2 + ... + x_n denote the sample sum. Now,

E S = E(x_1 + x_2 + ... + x_n) = E x_1 + E x_2 + ... + E x_n = nμ.  (2.1)

Also,

Var(S) = E(S − E S)² = E(S − nμ)²
  = E(x_1 + x_2 + ... + x_n − nμ)²
  = E[Σ_{i=1}^n (x_i − μ)]²
  = E[(x_1 − μ)² + (x_1 − μ)(x_2 − μ) + ... + (x_2 − μ)(x_1 − μ) + (x_2 − μ)² + ... + (x_n − μ)²]
  = nσ².  (2.2)

Note that E[(x_i − μ)(x_j − μ)] = 0, for i ≠ j, by independence.

2.2.2 Sample Mean

Let x̄ = S/n denote the sample mean or average.

2.2.3 Moments Of The Sample Mean

The mean of the sample mean is

E x̄ = E(S/n) = (1/n) E S = (1/n) nμ = μ.  (2.3)

The variance of the sample mean is

Var(x̄) = E(x̄ − μ)² = E(S/n − μ)² = E[(1/n)(S − nμ)]² = (1/n²) E(S − nμ)²
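The moment results for the sample sum, E S = nμ and Var(S) = nσ², can be checked by simulation. A stdlib-only sketch; the particular values n = 10, μ = 5, σ = 2 are illustrative, not from the text:

```python
import random
import statistics

random.seed(0)
n, mu, sigma = 10, 5.0, 2.0   # illustrative parameter values

# Draw many realizations of the sample sum S = x_1 + ... + x_n
# with x_i i.i.d. N(mu, sigma^2)
sums = [sum(random.gauss(mu, sigma) for _ in range(n)) for _ in range(20000)]

mean_S = statistics.fmean(sums)     # should be close to n*mu  = 50
var_S = statistics.pvariance(sums)  # should be close to n*sigma^2 = 40
```

The simulated mean and variance settle near nμ and nσ² as the number of replications grows, illustrating that the cross-product terms average out to zero under independence.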

= (1/n²) nσ² = σ²/n.  (2.4)

2.2.4 Sampling Distribution

We have been able to establish the mean and variance of the sample mean. However, in order to know its complete distribution precisely, we must know the probability density function (pdf) of the random variable x̄.

2.3 The Normal Distribution

2.3.1 Density Function

Definition 2.4 A continuous random variable x_i with the density function

f(x_i) = (1/√(2πσ²)) e^{−(x_i − μ)²/(2σ²)}  (2.5)

follows the normal distribution, where μ and σ² are the mean and variance of x_i, respectively.

Since the distribution is characterized by the two parameters μ and σ², we denote a normal random variable by x_i ∼ N(μ, σ²).

The normal density function is the familiar bell-shaped curve, as shown in Figure 2.1 for μ = 0 and σ² = 1. It is symmetric about the mean μ. Approximately 2/3 of the probability mass lies within ±σ of μ and about .95 lies within ±2σ. There are numerous examples of random variables that have this shape. Many economic variables are assumed to be normally distributed.

2.3.2 Linear Transformation

Consider the transformed random variable

Y_i = a + b x_i.

We know that

μ_Y = E Y_i = a + b μ_x

and

σ²_Y = E(Y_i − μ_Y)² = b² σ²_x.

If x_i is normally distributed, then Y_i is normally distributed as well. That is,

Y_i ∼ N(μ_Y, σ²_Y).

Figure 2.1: The Standard Normal Distribution

Moreover, if x_i ∼ N(μ_x, σ²_x) and z_i ∼ N(μ_z, σ²_z) are independent, then

Y_i = a + b x_i + c z_i ∼ N(a + bμ_x + cμ_z, b²σ²_x + c²σ²_z).

These results will be formally demonstrated in a more general setting in the next chapter.

2.3.3 Distribution Of The Sample Mean

If, for each i = 1, 2, ..., n, the x_i's are independent, identically distributed (i.i.d.) normal random variables, then

x̄ ∼ N(μ_x, σ²_x/n).  (2.6)

2.3.4 The Standard Normal

The distribution of x̄ will vary with different values of μ_x and σ²_x, which is inconvenient. Rather than dealing with a unique distribution for each case, we

perform the following transformation:

Z = (x̄ − μ_x)/√(σ²_x/n) = x̄/√(σ²_x/n) − μ_x/√(σ²_x/n).  (2.7)

Now,

E Z = (E x̄)/√(σ²_x/n) − μ_x/√(σ²_x/n) = μ_x/√(σ²_x/n) − μ_x/√(σ²_x/n) = 0.

Also,

Var(Z) = E[(x̄ − μ_x)/√(σ²_x/n)]² = (n/σ²_x) E(x̄ − μ_x)² = (n/σ²_x)(σ²_x/n) = 1.

Thus Z ∼ N(0, 1). The N(0, 1) distribution is the standard normal and is well tabulated. The probability density function for the standard normal distribution is

f(z_i) = (1/√(2π)) e^{−z_i²/2} = φ(z_i).  (2.8)

2.4 The Central Limit Theorem

2.4.1 Normal Theory

The normal density has a prominent position in statistics. This is not only because many random variables appear to be normal, but also because almost any sample mean appears normal as the sample size increases. Specifically, suppose x_1, x_2, ..., x_n is a simple random sample with E x_i = μ_x and Var(x_i) = σ²_x; then, as n → ∞, the distribution of x̄ becomes normal. That

is,

lim_{n→∞} f((x̄ − μ_x)/√(σ²_x/n)) = φ((x̄ − μ_x)/√(σ²_x/n)).  (2.9)

2.5 Distributions Associated With The Normal Distribution

2.5.1 The Chi-Square Distribution

Definition 2.5 Suppose that Z_1, Z_2, ..., Z_n is a simple random sample, and Z_i ∼ N(0, 1). Then

Σ_{i=1}^n Z_i² ∼ χ²_n,  (2.10)

where n is the degrees of freedom of the chi-square distribution.

The probability density function for a χ²_n random variable is

f_{χ²}(x) = (1/(2^{n/2} Γ(n/2))) x^{n/2 − 1} e^{−x/2},  x > 0,  (2.11)

where Γ(·) is the gamma function. See Figure 2.2.

If x_1, x_2, ..., x_n is a simple random sample, and x_i ∼ N(μ, σ²), then

Σ_{i=1}^n ((x_i − μ)/σ)² ∼ χ²_n.  (2.12)

The chi-square distribution will prove useful in testing hypotheses on both the variance of a single variable and the (conditional) means of several. This multivariate usage will be explored in the next chapter.

Example 2.7 Consider the estimator of σ²:

s² = (1/(n − 1)) Σ_{i=1}^n (x_i − x̄)².

Then

(n − 1) s²/σ² ∼ χ²_{n−1}.  (2.13)
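Definition 2.5 can be illustrated directly: summing squared independent N(0, 1) draws generates chi-square draws, whose mean and variance should be near the known values n and 2n. A stdlib-only sketch with illustrative settings (5 degrees of freedom, 20000 replications):

```python
import random
import statistics

random.seed(1)
n_df = 5  # degrees of freedom (illustrative choice)

# Each draw is the sum of n_df squared independent standard normals,
# i.e. one realization of a chi-square(n_df) random variable
draws = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n_df))
         for _ in range(20000)]

# A chi-square with n degrees of freedom has mean n and variance 2n
mean_hat = statistics.fmean(draws)   # should be near 5
var_hat = statistics.pvariance(draws)  # should be near 10
```

The skewed, strictly positive shape of the draws also matches the densities sketched in Figure 2.2.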

Figure 2.2: Some Chi-Square Distributions

2.5.2 The t Distribution

Definition 2.6 Suppose that Z ∼ N(0, 1), Y ∼ χ²_k, and that Z and Y are independent. Then

Z/√(Y/k) ∼ t_k,  (2.14)

where k is the degrees of freedom of the t distribution.

The probability density function for a t random variable with n degrees of freedom is

f_t(x) = (Γ((n + 1)/2)/(√(nπ) Γ(n/2))) (1 + x²/n)^{−(n+1)/2},  (2.15)

for −∞ < x < ∞. See Figure 2.3.

The t (also known as Student's t) distribution is named after W. S. Gosset, who published under the pseudonym Student. It is useful in testing hypotheses concerning the (conditional) mean when the variance is estimated.

Example 2.8 Consider the sample mean from a simple random sample of normals. We know that x̄ ∼ N(μ, σ²/n) and

Z = (x̄ − μ)/√(σ²/n) ∼ N(0, 1).

Figure 2.3: Some t Distributions

Also, we know that

Y = (n − 1) s²/σ² ∼ χ²_{n−1},

where s² is the unbiased estimator of σ². Thus, if Z and Y are independent (which, in fact, is the case), then

Z/√(Y/(n − 1)) = [(x̄ − μ)/√(σ²/n)] / √{[(n − 1) s²/σ²]/(n − 1)}
  = [(x̄ − μ)/√(σ²/n)] √(σ²/s²)
  = (x̄ − μ)/√(s²/n) ∼ t_{n−1}.  (2.16)

2.5.3 The F Distribution

Definition 2.7 Suppose that Y ∼ χ²_m, W ∼ χ²_n, and that Y and W are independent. Then

(Y/m)/(W/n) ∼ F_{m,n},  (2.17)

where m, n are the degrees of freedom of the F distribution.

The probability density function for an F random variable with m and n degrees of freedom is

f_F(x) = (Γ((m + n)/2)/(Γ(m/2) Γ(n/2))) (m/n)^{m/2} x^{m/2 − 1} (1 + mx/n)^{−(m+n)/2}.  (2.18)

The F distribution is named after the great statistician Sir Ronald A. Fisher, and is used in many applications, most notably in the analysis of variance. This situation will arise when we seek to test multiple (conditional) mean parameters with estimated variance. Note that when x ∼ t_n, then x² ∼ F_{1,n}. Some examples of the F distribution can be seen in Figure 2.4.

Figure 2.4: Some F Distributions

Chapter 3

Multivariate Distributions

3.1 Matrix Algebra Of Expectations

3.1.1 Moments of Random Vectors

Let

x = (x_1, x_2, ..., x_m)′

be an m×1 vector-valued random variable. Each element of the vector is a scalar random variable of the type discussed in the previous chapter. The expectation of a random vector is

E[x] = (E[x_1], E[x_2], ..., E[x_m])′ = (μ_1, μ_2, ..., μ_m)′ = μ.  (3.1)

Note that μ is also an m×1 column vector. We see that the mean of the vector is the vector of the means.

Next, we evaluate the following:

E[(x − μ)(x − μ)′] =
  [ E(x_1 − μ_1)²             E(x_1 − μ_1)(x_2 − μ_2)   ⋯   E(x_1 − μ_1)(x_m − μ_m) ]
  [ E(x_2 − μ_2)(x_1 − μ_1)   E(x_2 − μ_2)²             ⋯   E(x_2 − μ_2)(x_m − μ_m) ]
  [ ⋮                                                         ⋮                     ]
  [ E(x_m − μ_m)(x_1 − μ_1)   E(x_m − μ_m)(x_2 − μ_2)   ⋯   E(x_m − μ_m)²           ]

= [ σ_11  σ_12  ⋯  σ_1m ]
  [ σ_21  σ_22  ⋯  σ_2m ]
  [ ⋮                 ⋮ ]
  [ σ_m1  σ_m2  ⋯  σ_mm ]
= Σ.  (3.2)

Σ, the covariance matrix, is an m×m matrix of variance and covariance terms. The variance σ_i² = σ_ii of x_i is along the diagonal, while the cross-product terms represent the covariance between x_i and x_j.

3.1.2 Properties Of The Covariance Matrix

Symmetric The variance-covariance matrix Σ is a symmetric matrix. This can be shown by noting that

σ_ij = E(x_i − μ_i)(x_j − μ_j) = E(x_j − μ_j)(x_i − μ_i) = σ_ji.

Due to this symmetry, Σ will only have m(m + 1)/2 unique elements.

Positive Semidefinite Σ is a positive semidefinite matrix. Recall that an m×m matrix is positive semidefinite if and only if it meets any of the following three equivalent conditions:

1. All the principal minors are nonnegative;
2. λ′Σλ ≥ 0, for all m×1 vectors λ ≠ 0;
3. Σ = PP′, for some m×m matrix P.

The first condition (actually we use negative definiteness) is useful in the study of utility maximization, while the latter two are useful in econometric analysis. The second condition is the easiest to demonstrate in the current context. Let λ ≠ 0. Then, we have

λ′Σλ = λ′ E[(x − μ)(x − μ)′] λ = E[λ′(x − μ)(x − μ)′λ] = E{[λ′(x − μ)]²} ≥ 0,

since the term inside the expectation is a quadratic. Hence, Σ is a positive semidefinite matrix.

Note that a P satisfying the third relationship is not unique. Let D be any m×m orthonormal matrix, so that DD′ = I_m, and let P* = PD; then

P*P*′ = PDD′P′ = P I_m P′ = Σ.

Usually, we will choose P to be an upper or lower triangular matrix with m(m + 1)/2 nonzero elements.

Positive Definite Since Σ is a positive semidefinite matrix, it will be a positive definite matrix if and only if det(Σ) ≠ 0. Now, we know that Σ = PP′ for some m×m matrix P. This implies that det(Σ) = det(P) det(P′) = [det(P)]².

3.1.3 Linear Transformations

Let y = b + Bx, where y and b are m×1 and B is m×m. Then

E[y] = b + B E[x] = b + Bμ = μ_y.  (3.3)

Thus, the mean of a linear transformation is the linear transformation of the mean. Next, we have

E[(y − μ_y)(y − μ_y)′] = E{[B(x − μ)][B(x − μ)]′}
  = B E[(x − μ)(x − μ)′] B′
  = BΣB′  (3.4)
  = Σ_y,  (3.5)

where we use the result (ABC)′ = C′B′A′, if conformability holds.

3.2 Change Of Variables

3.2.1 Univariate

Let x be a random variable and f_x(·) be the probability density function of x. Now, define y = h(x), where

h′(x) = dh(x)/dx > 0.

That is, h(x) is a strictly monotonically increasing function, and so y is a one-to-one transformation of x. Now, we would like to know the probability density function of y, f_y(y). To find it, we note that

Pr(y ≤ h(a)) = Pr(x ≤ a),  (3.6)

and

Pr(x ≤ a) = ∫_{−∞}^{a} f_x(x) dx = F_x(a),  (3.7)

Pr(y ≤ h(a)) = ∫_{−∞}^{h(a)} f_y(y) dy = F_y(h(a)),  (3.8)

for all a. Assuming that the cumulative distribution function is differentiable, we use (3.6) to combine (3.7) and (3.8), and take the total differential, which gives us

F_x(a) = F_y(h(a))
f_x(a) da = f_y(h(a)) h′(a) da,

for all a. Thus, for a small perturbation,

f_x(a) = f_y(h(a)) h′(a)  (3.9)

for all a. Also, since y is a one-to-one transformation of x, we know that h(·) can be inverted. That is, x = h^{−1}(y). Thus, a = h^{−1}(y), and we can rewrite (3.9) as

f_x(h^{−1}(y)) = f_y(y) h′(h^{−1}(y)).

Therefore, the probability density function of y is

f_y(y) = f_x(h^{−1}(y)) / h′(h^{−1}(y)).  (3.10)

Note that f_y(y) has the property of being nonnegative, since h′(·) > 0. If h′(·) < 0, (3.10) can be corrected by taking the absolute value of h′(·), which will assure that we have only positive values for our probability density function.

3.2.2 Geometric Interpretation

Consider the graph of the relationship shown in Figure 3.1. We know that

Pr[h(b) > y > h(a)] = Pr(b > x > a).

Also, we know that

Pr[h(b) > y > h(a)] ≈ f_y[h(b)][h(b) − h(a)],
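The univariate change-of-variables formula can be checked numerically. A stdlib-only sketch for the illustrative case h(x) = eˣ with x standard normal (so h′ > 0 as the derivation assumes): the implied density of y, multiplied by a small interval width, should match the exact probability of that interval computed from the normal CDF.

```python
import math

# Verify the change-of-variables formula for h(x) = e^x, x ~ N(0, 1):
# f_y(y) = f_x(h^{-1}(y)) / h'(h^{-1}(y)) = f_x(ln y) / y, since h'(x) = e^x

def f_x(x):
    """Standard normal pdf."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def F_x(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def f_y(y):
    """Density of y = e^x implied by the change-of-variables formula."""
    return f_x(math.log(y)) / y

# Pr(y in [a, a+dy]) ~ f_y(a)*dy should match F_x(ln(a+dy)) - F_x(ln a)
a, dy = 2.0, 1e-6
prob_from_density = f_y(a) * dy
prob_from_cdf = F_x(math.log(a + dy)) - F_x(math.log(a))
```

The two probabilities agree to high relative accuracy, mirroring the geometric argument of Section 3.2.2 where the interval [a, a + dy] shrinks.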

Figure 3.1: Change of Variables

and

Pr(b > x > a) ≈ f_x(b)(b − a).

So,

f_y[h(b)][h(b) − h(a)] ≈ f_x(b)(b − a)

f_y[h(b)] ≈ f_x(b) / {[h(b) − h(a)]/(b − a)}.  (3.11)

Now, as we let a → b, the denominator of (3.11) approaches h′(b). This is then the same formula as (3.10).

3.2.3 Multivariate

Let x be m×1 with density f_x(x). Define the one-to-one transformation y = h(x), where y is m×1. Since h(·) is a one-to-one transformation, it has an inverse:

x = h^{−1}(y).

We also assume that ∂h(x)/∂x′ exists. This is the m×m Jacobian matrix

J_x(x) = ∂h(x)/∂x′ =
  [ ∂h_1(x)/∂x_1  ∂h_1(x)/∂x_2  ⋯  ∂h_1(x)/∂x_m ]
  [ ∂h_2(x)/∂x_1  ∂h_2(x)/∂x_2  ⋯  ∂h_2(x)/∂x_m ]
  [ ⋮                                          ⋮ ]
  [ ∂h_m(x)/∂x_1  ∂h_m(x)/∂x_2  ⋯  ∂h_m(x)/∂x_m ].  (3.12)

Given this notation, the multivariate analog to (3.10) can be shown to be

f_y(y) = f_x[h^{−1}(y)] |det(J_x[h^{−1}(y)])|^{−1}.  (3.13)

Since h(·) is differentiable and one-to-one, det(J_x[h^{−1}(y)]) ≠ 0.

Example 3.1 Let y = b_0 + b_1 x, where x, b_0, and b_1 are scalars. Then x = (y − b_0)/b_1 and dy/dx = b_1. Therefore,

f_y(y) = f_x((y − b_0)/b_1) (1/b_1).

Example 3.2 Let y = b + Bx, where y is m×1 and det(B) ≠ 0. Then x = B^{−1}(y − b) and ∂y/∂x′ = B = J_x(x). Thus,

f_y(y) = f_x(B^{−1}(y − b)) |det(B^{−1})| = f_x(B^{−1}(y − b)) / |det(B)|.

3.3 Multivariate Normal Distribution

3.3.1 Spherical Normal Distribution

Definition 3.1 An m×1 random vector z is said to be spherically normally distributed if

f(z) = (2π)^{−m/2} e^{−z′z/2}.

Such a random vector can be seen to be a vector of independent standard normals. Let z_1, z_2, ..., z_m be i.i.d. random variables such that z_i ∼ N(0, 1). That is, z_i has the pdf given in (2.8), for i = 1, ..., m. Then, by independence, the joint distribution of the z_i's is given by

f(z_1, z_2, ..., z_m) = f(z_1) f(z_2) ⋯ f(z_m)
  = Π_{i=1}^m (1/√(2π)) e^{−z_i²/2}
  = (2π)^{−m/2} e^{−(1/2) Σ_{i=1}^m z_i²}
  = (2π)^{−m/2} e^{−z′z/2},  (3.14)

where z′ = (z_1 z_2 ... z_m).

3.3.2 Multivariate Normal

Definition 3.2 The m×1 random vector x with density

f_x(x) = (2π)^{−m/2} [det(Σ)]^{−1/2} e^{−(1/2)(x − μ)′Σ^{−1}(x − μ)}  (3.15)

is said to be distributed multivariate normal with mean vector μ and positive definite covariance matrix Σ. Such a distribution for x is denoted by x ∼ N(μ, Σ).

The spherical normal distribution is seen to be the special case where μ = 0 and Σ = I_m. There is a one-to-one relationship between a multivariate normal random vector and a spherical normal random vector. Let z be an m×1 spherical normal random vector and

x = μ + Az,

where det(A) ≠ 0. Then,

E x = μ + A E z = μ,  (3.16)

since E[z] = 0.

Also, we know that

E(zz′) = E [ z_1²    z_1z_2  ⋯  z_1z_m ]
           [ z_2z_1  z_2²    ⋯  z_2z_m ]
           [ ⋮                       ⋮ ]
           [ z_mz_1  z_mz_2  ⋯  z_m²   ]
       = I_m,  (3.17)

since E[z_i z_j] = 0 for all i ≠ j, and E[z_i²] = 1 for all i. Therefore,

E[(x − μ)(x − μ)′] = E(Azz′A′) = A E(zz′) A′ = A I_m A′ = AA′ = Σ,  (3.18)

where Σ is a positive definite matrix (since det(A) ≠ 0).

Next, we need to find the probability density function f_x(x) of x. We know that z = A^{−1}(x − μ), z′ = (x − μ)′(A^{−1})′, and J_z(z) = A, so we use (3.13) to get

f_x(x) = f_z[A^{−1}(x − μ)] |det(A)|^{−1}
  = (2π)^{−m/2} e^{−(1/2)(x − μ)′(A^{−1})′A^{−1}(x − μ)} |det(A)|^{−1}
  = (2π)^{−m/2} e^{−(1/2)(x − μ)′(AA′)^{−1}(x − μ)} |det(A)|^{−1},  (3.19)

where we use the results (ABC)^{−1} = C^{−1}B^{−1}A^{−1} and (A^{−1})′ = (A′)^{−1}. However, Σ = AA′, so det(Σ) = det(A) det(A′) = [det(A)]², and |det(A)| = [det(Σ)]^{1/2}. Thus we can rewrite (3.19) as

f_x(x) = (2π)^{−m/2} [det(Σ)]^{−1/2} e^{−(1/2)(x − μ)′Σ^{−1}(x − μ)},  (3.20)

and we see that x ∼ N(μ, Σ) with mean vector μ and covariance matrix Σ. Since this process is completely reversible, the relationship is one-to-one.
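The construction x = μ + Az and the result Σ = AA′ can be checked by simulation. A stdlib-only sketch for an illustrative 2×2 case (the particular μ and A below are hypothetical values, not from the text):

```python
import random

random.seed(2)

# x = mu + A z, with z a vector of independent N(0,1) draws,
# should have mean mu and covariance Sigma = A A'
mu = [1.0, -2.0]
A = [[2.0, 0.0],
     [1.0, 3.0]]   # lower triangular, det(A) = 6 != 0
Sigma = [[A[i][0] * A[j][0] + A[i][1] * A[j][1] for j in range(2)]
         for i in range(2)]   # A A' = [[4, 2], [2, 10]]

N = 50000
draws = []
for _ in range(N):
    z = [random.gauss(0, 1), random.gauss(0, 1)]
    draws.append([mu[i] + A[i][0] * z[0] + A[i][1] * z[1] for i in range(2)])

xbar = [sum(d[i] for d in draws) / N for i in range(2)]
S_hat = [[sum((d[i] - xbar[i]) * (d[j] - xbar[j]) for d in draws) / N
          for j in range(2)] for i in range(2)]  # sample covariance, near Sigma
```

Choosing A lower triangular matches the text's remark that P in Σ = PP′ is usually taken triangular with m(m + 1)/2 nonzero elements.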

3.3.3 Linear Transformations

Theorem 3.1 Suppose x ∼ N(μ, Σ) with det(Σ) ≠ 0, and y = b + Bx with B square and det(B) ≠ 0. Then y ∼ N(μ_y, Σ_y).

Proof: From (3.3) and (3.4), we have E y = b + Bμ = μ_y and E[(y − μ_y)(y − μ_y)′] = BΣB′ = Σ_y. To find the probability density function f_y(y) of y, we again use (3.13), which gives us

f_y(y) = f_x[B^{−1}(y − b)] |det(B)|^{−1}
  = (2π)^{−m/2} [det(Σ)]^{−1/2} e^{−(1/2)[B^{−1}(y − b) − μ]′Σ^{−1}[B^{−1}(y − b) − μ]} |det(B)|^{−1}
  = (2π)^{−m/2} [det(BΣB′)]^{−1/2} e^{−(1/2)(y − b − Bμ)′(BΣB′)^{−1}(y − b − Bμ)}
  = (2π)^{−m/2} [det(Σ_y)]^{−1/2} e^{−(1/2)(y − μ_y)′Σ_y^{−1}(y − μ_y)}.  (3.21)

So, y ∼ N(μ_y, Σ_y).

Thus we see, as asserted in the previous chapter, that linear transformations of multivariate normal random variables are also multivariate normal random variables. And any linear combination of independent normals will also be normal.

3.3.4 Quadratic Forms

Theorem 3.2 Let x ∼ N(μ, Σ), where det(Σ) ≠ 0. Then (x − μ)′Σ^{−1}(x − μ) ∼ χ²_m.

Proof: Let Σ = PP′. Then

(x − μ) ∼ N(0, Σ),

and

z = P^{−1}(x − μ) ∼ N(0, I_m).

Therefore,

z′z = Σ_{i=1}^m z_i² = [P^{−1}(x − μ)]′[P^{−1}(x − μ)]
  = (x − μ)′(P^{−1})′P^{−1}(x − μ)
  = (x − μ)′(PP′)^{−1}(x − μ)
  = (x − μ)′Σ^{−1}(x − μ) ∼ χ²_m.  (3.22)

Σ^{−1} is called the weight matrix. With this result, we can use the χ²_m distribution to make inferences about the mean μ of x.

3.4 Normality and the Sample Mean

3.4.1 Moments of the Sample Mean

Consider the m×1 vectors x_i, i.i.d. jointly, with m×1 mean vector E[x_i] = μ and m×m covariance matrix E[(x_i − μ)(x_i − μ)′] = Σ. Define x̄_n = (1/n) Σ_{i=1}^n x_i as the vector sample mean, which is also the vector of scalar sample means. The mean of the vector sample mean follows directly:

E[x̄_n] = (1/n) Σ_{i=1}^n E[x_i] = μ.

Alternatively, this result can be obtained by applying the scalar results element by element. The second moment matrix of the vector sample mean is given by

E[(x̄_n − μ)(x̄_n − μ)′]
  = (1/n²) E[(Σ_{i=1}^n x_i − nμ)(Σ_{i=1}^n x_i − nμ)′]
  = (1/n²) E[{(x_1 − μ) + (x_2 − μ) + ... + (x_n − μ)}{(x_1 − μ) + (x_2 − μ) + ... + (x_n − μ)}′]
  = (1/n²) nΣ
  = (1/n) Σ,

since the covariances between different observations are zero.

3.4.2 Distribution of the Sample Mean

Suppose x_i i.i.d. N(μ, Σ) jointly. Then it follows from joint multivariate normality that x̄_n must also be multivariate normal, since it is a linear transformation. Specifically, we have

x̄_n ∼ N(μ, (1/n)Σ),

or equivalently

x̄_n − μ ∼ N(0, (1/n)Σ)
√n (x̄_n − μ) ∼ N(0, Σ)
√n Σ^{−1/2} (x̄_n − μ) ∼ N(0, I_m),

where Σ = Σ^{1/2} Σ^{1/2}′ and Σ^{−1/2} = (Σ^{1/2})^{−1}.

3.4.3 Multivariate Central Limit Theorem

Theorem 3.3 Suppose that (i) the x_i are i.i.d. jointly, (ii) E[x_i] = μ, and (iii) E[(x_i − μ)(x_i − μ)′] = Σ. Then

√n (x̄_n − μ) →_d N(0, Σ),

or equivalently

z = √n Σ^{−1/2} (x̄_n − μ) →_d N(0, I_m).

These results apply even if the original underlying distribution is not normal, and follow directly from the scalar results applied to any linear combination of x̄_n.

3.4.4 Limiting Behavior of Quadratic Forms

Consider the following quadratic form:

n (x̄_n − μ)′Σ^{−1}(x̄_n − μ) = n (x̄_n − μ)′(Σ^{1/2} Σ^{1/2}′)^{−1}(x̄_n − μ)
  = [√n Σ^{−1/2}(x̄_n − μ)]′[√n Σ^{−1/2}(x̄_n − μ)]
  = z′z →_d χ²_m.

This form is convenient for asymptotic joint tests concerning more than one mean at a time.

3.5 Noncentral Distributions

3.5.1 Noncentral Scalar Normal

Definition 3.3 Let x ∼ N(μ, σ²). Then

z = x/σ ∼ N(μ/σ, 1)  (3.23)

has a noncentral normal distribution.

Example 3.3 When we do a hypothesis test of the mean, with known variance, we have, under the null hypothesis H_0: μ = μ_0,

(x − μ_0)/σ ∼ N(0, 1)  (3.24)

and, under the alternative H_1: μ = μ_1 ≠ μ_0,

(x − μ_0)/σ = (x − μ_1)/σ + (μ_1 − μ_0)/σ ∼ N(0, 1) + (μ_1 − μ_0)/σ = N((μ_1 − μ_0)/σ, 1).  (3.25)

Thus, the behavior of (x − μ_0)/σ under the alternative hypothesis follows a noncentral normal distribution. As this example makes clear, the noncentral normal distribution is especially useful when carefully exploring the behavior of the alternative hypothesis.
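The limiting quadratic form can be illustrated without any normality in the underlying data. A stdlib-only sketch using an illustrative case: m = 2 independent Uniform(0, 1) components, so μ_j = 1/2, Σ = (1/12) I_2 and Σ^{−1} = 12 I_2; the resulting statistic should behave like a χ²_2 draw, whose mean is 2.

```python
import random

random.seed(4)

# q_n = n (xbar_n - mu)' Sigma^{-1} (xbar_n - mu), with two independent
# Uniform(0,1) components per observation: mu_j = 1/2, Var = 1/12,
# so Sigma^{-1} = 12 * I_2 and q_n should be approximately chi-square(2)
n, reps = 200, 10000
q_draws = []
for _ in range(reps):
    xbar = [sum(random.random() for _ in range(n)) / n for _ in range(2)]
    q = n * 12.0 * ((xbar[0] - 0.5) ** 2 + (xbar[1] - 0.5) ** 2)
    q_draws.append(q)

mean_q = sum(q_draws) / reps   # should be close to 2, the chi-square(2) mean
```

The uniforms are deliberately non-normal, echoing the remark that these limiting results do not require normality of the underlying distribution.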

3.5.2 Noncentral t

Definition 3.4 Let z ∼ N(μ/σ, 1), w ∼ χ²_k, and let z and w be independent. Then

z/√(w/k) ∼ t_k(μ/σ)  (3.26)

has a noncentral t distribution, with noncentrality parameter μ/σ.

The noncentral t distribution is used in tests of the mean when the variance is unknown.

3.5.3 Noncentral Chi-Square

Definition 3.5 Let z ∼ N(μ, I_m). Then

z′z ∼ χ²_m(δ)  (3.27)

has a noncentral chi-square distribution, where δ = μ′μ is the noncentrality parameter.

In the noncentral chi-square distribution, the probability mass is shifted to the right as compared to a central chi-square distribution.

Example 3.4 When we do a test of μ, with known Σ, we have, under H_0: μ = μ_0,

(x − μ_0)′Σ^{−1}(x − μ_0) ∼ χ²_m.  (3.28)

Let z = P^{−1}(x − μ_0), where Σ = PP′. Then, under H_1: μ = μ_1 ≠ μ_0, we have

(x − μ_0)′Σ^{−1}(x − μ_0) = (x − μ_0)′(P^{−1})′P^{−1}(x − μ_0) = z′z ∼ χ²_m[(μ_1 − μ_0)′Σ^{−1}(μ_1 − μ_0)].  (3.29)

3.5.4 Noncentral F

Definition 3.6 Let Y ∼ χ²_m(δ), W ∼ χ²_n, and let Y and W be independent random variables. Then

(Y/m)/(W/n) ∼ F_{m,n}(δ)  (3.30)

has a noncentral F distribution, where δ is the noncentrality parameter.

The noncentral F distribution is used in tests of mean vectors, where the variance-covariance matrix is unknown and must be estimated.

Chapter 4

Asymptotic Theory

4.1 Convergence Of Random Variables

4.1.1 Limits And Orders Of Sequences

Definition 4.1 A sequence of real numbers a_1, a_2, ..., a_n, ... is said to have a limit of α if for every δ > 0 there exists a positive real number N such that for all n > N, |a_n − α| < δ. This is denoted as lim_{n→∞} a_n = α.

Definition 4.2 A sequence of real numbers {a_n} is said to be of at most order n^k, and we write {a_n} is O(n^k), if

lim_{n→∞} n^{−k} a_n = c,

where c is any real constant.

Example 4.1 Let {a_n} = 3 + 1/n and {b_n} = 4 − 2n. Then {a_n} is O(1) = O(n^0), since lim_{n→∞} n^0 a_n = 3, and {b_n} is O(n), since lim_{n→∞} n^{−1} b_n = −2.

Definition 4.3 A sequence of real numbers {a_n} is said to be of order smaller than n^k, and we write {a_n} is o(n^k), if

lim_{n→∞} n^{−k} a_n = 0.

Example 4.2 Let {a_n} = 1/n. Then {a_n} is o(1), since lim_{n→∞} n^0 a_n = 0.

4.1.2 Convergence In Probability

Definition 4.4 A sequence of random variables y_1, y_2, ..., y_n, ..., with distribution functions F_1(·), F_2(·), ..., F_n(·), ..., is said to converge weakly in probability to some constant c if

lim_{n→∞} Pr[|y_n − c| > ε] = 0  (4.1)

for every real number ε > 0. Weak convergence in probability is denoted by

plim_{n→∞} y_n = c,  (4.2)

or sometimes

y_n →_p c.  (4.3)

This definition is equivalent to saying that we have a sequence of tail probabilities (of being greater than c + ε or less than c − ε), and that the tail probabilities approach 0 as n → ∞, regardless of how small ε is chosen. Equivalently, the probability mass of the distribution of y_n is collapsing about the point c.

Definition 4.5 A sequence of random variables y_1, y_2, ..., y_n, ... is said to converge strongly in probability to some constant c if

lim_{N→∞} Pr[sup_{n>N} |y_n − c| > ε] = 0  (4.4)

for any real ε > 0. Strong convergence is also called almost sure convergence and is denoted

y_n →_{a.s.} c.  (4.5)

Notice that if a sequence of random variables converges strongly in probability, it converges weakly in probability. The difference between the two is that almost

sure convergence involves an element of uniformity that weak convergence does not. A sequence that is weakly convergent can have Pr[|y_n − c| > ε] wiggle-waggle above and below a constant δ used in the limit and then settle down to subsequently be smaller and meet the condition. For strong convergence, once the probability falls below δ for a particular N in the sequence, it will subsequently be smaller for all larger N.

Definition 4.6 A sequence of random variables y_1, y_2, ..., y_n, ... is said to converge in quadratic mean if

lim_{n→∞} E[y_n] = c and lim_{n→∞} Var[y_n] = 0.

By Chebyshev's inequality, convergence in quadratic mean implies weak convergence in probability. For a random variable x with mean μ and variance σ², Chebyshev's inequality states

Pr(|x − μ| ≥ kσ) ≤ 1/k².

Let σ_n² denote the variance of y_n; then we can write the condition for the present case as

Pr(|y_n − E[y_n]| ≥ kσ_n) ≤ 1/k².

Since σ_n² → 0 and E[y_n] → c, the probability that y_n deviates from c by any fixed amount can be made smaller than 1/k² for sufficiently large n, for any choice of k. But this is just weak convergence in probability to c.

4.1.3 Orders In Probability

Definition 4.7 Let y_1, y_2, ..., y_n, ... be a sequence of random variables. This sequence is said to be bounded in probability if for any 1 > δ > 0 there exist a Δ < ∞ and some N sufficiently large such that

Pr(|y_n| > Δ) < δ

for all n > N.

These conditions require that the tail behavior of the distributions of the sequence not be pathological. Specifically, the tail mass of the distributions cannot be drifting away from zero as we move out in the sequence.

Definition 4.8 The sequence of random variables {y_n} is said to be at most of order in probability n^λ, denoted O_p(n^λ), if n^{−λ} y_n is bounded in probability.

Example 4.3 Suppose z ∼ N(0, 1) and y_n = 3 + n z; then n^{−1} y_n = 3/n + z is bounded in probability, since the first term is asymptotically negligible, and we see that y_n = O_p(n).

Definition 4.9 The sequence of random variables {y_n} is said to be of order in probability smaller than n^λ, denoted o_p(n^λ), if n^{−λ} y_n →_p 0.

Example 4.4 Convergence in probability can be represented in terms of order in probability. Suppose that y_n →_p c, or equivalently y_n − c →_p 0; then n^0 (y_n − c) →_p 0 and y_n − c = o_p(1).

4.1.4 Convergence In Distribution

Definition 4.10 A sequence of random variables y_1, y_2, ..., y_n, ..., with cumulative distribution functions F_1(·), F_2(·), ..., F_n(·), ..., is said to converge in distribution to a random variable y with a cumulative distribution function F(y) if

lim_{n→∞} F_n(y) = F(y)  (4.6)

for every point of continuity of F(·). The distribution F(·) is said to be the limiting distribution of this sequence of random variables.

For notational convenience, we often write y_n →_d F(·) if a sequence of random variables converges in distribution to F(·). Note that the moments of elements of the sequence do not necessarily converge to the moments of the limiting distribution.

4.1.5 Some Useful Propositions

In the following propositions, let x_n and y_n be sequences of random vectors.

Proposition 4.1 If x_n − y_n converges in probability to zero, and y_n has a limiting distribution, then x_n has a limiting distribution, which is the same.

Proposition 4.2 If y_n has a limiting distribution and plim_{n→∞} x_n = 0, then for z_n = y_n′ x_n, plim_{n→∞} z_n = 0.

Proposition 4.3 Suppose that y_n converges in distribution to a random variable y, and plim_{n→∞} x_n = c. Then x_n′ y_n converges in distribution to c′y.

Proposition 4.4 If g(·) is a continuous function, and if x_n − y_n converges in probability to zero, then g(x_n) − g(y_n) converges in probability to zero.

Proposition 4.5 If g(·) is a continuous function, and if x_n converges in probability to a constant c, then z_n = g(x_n) converges in distribution to the constant g(c).

Proposition 4.6 If g(·) is a continuous function, and if x_n converges in distribution to a random variable x, then z_n = g(x_n) converges in distribution to a random variable g(x).

4.2 Estimation Theory

4.2.1 Properties Of Estimators

Definition 4.11 An estimator θ̂_n of the p×1 parameter vector θ is a function of the sample observations x_1, x_2, ..., x_n. It follows that θ̂_1, θ̂_2, ..., θ̂_n, ... form a sequence of random variables.

Definition 4.12 The estimator θ̂_n is said to be unbiased if E θ̂_n = θ, for all n.

Definition 4.13 The estimator θ̂_n is said to be asymptotically unbiased if lim_{n→∞} E θ̂_n = θ.

Note that an estimator can be biased in finite samples, but asymptotically unbiased.

Definition 4.14 The estimator θ̂_n is said to be consistent if plim_{n→∞} θ̂_n = θ.

Consistency neither implies nor is implied by asymptotic unbiasedness, as demonstrated by the following examples.

Example 4.5 Let

θ̃_n = { θ,       with probability 1 − 1/n
      { θ + nc,  with probability 1/n.

We have E θ̃_n = θ + c, so θ̃_n is a biased estimator, and lim_{n→∞} E θ̃_n = θ + c, so θ̃_n is asymptotically biased as well. However, lim_{n→∞} Pr(|θ̃_n − θ| > ε) = 0, so θ̃_n is a consistent estimator.

Example 4.6 Suppose x_i i.i.d. N(μ, σ²) for i = 1, 2, ..., n, ..., and let x̃_n = x_n (the n-th observation) be an estimator of μ. Now, E[x̃_n] = μ, so the estimator is unbiased, but Pr(|x̃_n − μ| > 1.96σ) = .05, so the probability mass is not collapsing about the target point μ, and the estimator is not consistent.
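Example 4.5 can be worked out in exact arithmetic: the expectation equals θ + c for every n (so the bias never vanishes), while the tail probability is just 1/n (so the estimator is consistent). A sketch with illustrative integer values for θ and c:

```python
from fractions import Fraction

# Example 4.5: theta_tilde_n = theta with probability 1 - 1/n,
# and theta + n*c with probability 1/n
theta, c = 10, 3  # illustrative values, not from the text

def mean_and_tail(n, eps=1):
    """Return (E[theta_tilde_n], Pr(|theta_tilde_n - theta| > eps))."""
    p_bad = Fraction(1, n)
    # E = (1 - 1/n)*theta + (1/n)*(theta + n*c) = theta + c for every n
    expectation = (1 - p_bad) * theta + p_bad * (theta + n * c)
    # only the "bad" branch deviates from theta, by n*c
    tail = p_bad if n * c > eps else Fraction(0)
    return expectation, tail

# Bias is c at every n (asymptotically biased), yet the tail
# probability 1/n -> 0 (consistent).
```

The rare, increasingly extreme outlier (θ + nc with probability 1/n) is exactly what keeps the mean shifted while the distribution still collapses onto θ.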

4.2.2 Laws Of Large Numbers And Central Limit Theorems

Most of the large-sample properties of the estimators considered in the sequel derive from the fact that the estimators involve sample averages, and the asymptotic behavior of averages is well studied. In addition to the central limit theorems presented in the previous two chapters, we have the following two laws of large numbers:

Theorem 4.1 If x_1, x_2, ..., x_n is a simple random sample, that is, the x_i's are i.i.d., and E x_i = μ and Var(x_i) = σ², then, by Chebyshev's inequality,

plim_{n→∞} x̄_n = μ.  (4.7)

Theorem 4.2 (Khintchine) Suppose that x_1, x_2, ..., x_n are i.i.d. random variables such that E x_i = μ for all i = 1, ..., n. Then

plim_{n→∞} x̄_n = μ.  (4.8)

Both of these results apply element-by-element to vectors of estimators. For the sake of completeness, we repeat the following scalar central limit theorem.

Theorem 4.3 (Lindeberg-Lévy) Suppose that x_1, x_2, ..., x_n are i.i.d. random variables such that E x_i = μ and Var(x_i) = σ². Then

√n (x̄_n − μ) →_d N(0, σ²),  (4.9)

that is, the distribution of √n(x̄_n − μ) converges to the normal distribution with density (2πσ²)^{−1/2} e^{−v²/(2σ²)}.

This result is easily generalized to obtain the multivariate version given in Theorem 3.3.

Theorem 4.4 (Multivariate CLT) Suppose that the m×1 random vectors x_1, x_2, ..., x_n are (i) jointly i.i.d., (ii) E x_i = μ, and (iii) Cov(x_i) = Σ. Then

√n (x̄_n − μ) →_d N(0, Σ).  (4.10)

4.2.3 CUAN And Efficiency

Definition 4.15 An estimator is said to be consistently uniformly asymptotically normal (CUAN) if it is consistent, if √n(θ̂_n − θ) converges in distribution to N(0, Ψ), and if the convergence is uniform over some compact subset of the parameter space.
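The Lindeberg-Lévy result can be seen in a small simulation: even with decidedly non-normal Uniform(0, 1) draws, the standardized sample mean behaves like a standard normal. A stdlib-only sketch (sample size 200 and 10000 replications are illustrative choices):

```python
import random
import math

random.seed(3)

# x_i ~ Uniform(0, 1): mu = 1/2, sigma^2 = 1/12. The standardized mean
# sqrt(n)(xbar - mu)/sigma should be approximately N(0, 1) for large n,
# even though the x_i themselves are far from normal.
n, reps = 200, 10000
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)

z_draws = []
for _ in range(reps):
    xbar = sum(random.random() for _ in range(n)) / n
    z_draws.append(math.sqrt(n) * (xbar - mu) / sigma)

# For a N(0, 1) variable, Pr(|Z| > 1.96) is about 0.05
tail_share = sum(abs(z) > 1.96 for z in z_draws) / reps
```

The share of standardized means beyond ±1.96 lands near the normal tail probability .05, which is what makes the asymptotic inference of Section 4.3 work.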

Suppose that $\sqrt{n}(\hat{\theta}_n - \theta)$ converges in distribution to $N(0, \Psi)$. Let $\tilde{\theta}_n$ be an alternative estimator such that $\sqrt{n}(\tilde{\theta}_n - \theta)$ converges in distribution to $N(0, \Omega)$.

Definition 4.6 If $\hat{\theta}_n$ is CUAN with asymptotic covariance $\Psi$ and $\tilde{\theta}_n$ is CUAN with asymptotic covariance $\Omega$, then $\hat{\theta}_n$ is asymptotically efficient relative to $\tilde{\theta}_n$ if $\Omega - \Psi$ is a positive semidefinite matrix.

Among other properties, asymptotic relative efficiency implies that the diagonal elements of $\Psi$ are no larger than those of $\Omega$, so the asymptotic variances of $\hat{\theta}_{n,i}$ are no larger than those of $\tilde{\theta}_{n,i}$. A similar result applies for the asymptotic variance of any linear combination.

Definition 4.7 A CUAN estimator $\hat{\theta}_n$ is said to be asymptotically efficient if it is asymptotically efficient relative to any other CUAN estimator.

4.3 Asymptotic Inference

4.3.1 Normal Ratios

Now, under the conditions of the central limit theorem, $\sqrt{n}(\bar{x}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$, so

$\frac{\bar{x}_n - \mu}{\sqrt{\sigma^2/n}} \xrightarrow{d} N(0, 1)$   (4.11)

Suppose that $\hat{\sigma}^2$ is a consistent estimator of $\sigma^2$. Then we also have

$\frac{\bar{x}_n - \mu}{\sqrt{\hat{\sigma}^2/n}} = \sqrt{\frac{\sigma^2}{\hat{\sigma}^2}} \cdot \frac{\bar{x}_n - \mu}{\sqrt{\sigma^2/n}} \xrightarrow{d} N(0, 1),$   (4.12)

since the term under the square root converges in probability to one and the remainder converges in distribution to $N(0, 1)$.

Most typically, such ratios will be used for inference in testing a hypothesis. Now, for $H_0: \mu = \mu_0$, we have

$\frac{\bar{x}_n - \mu_0}{\sqrt{\hat{\sigma}^2/n}} \xrightarrow{d} N(0, 1),$   (4.13)

while under $H_1: \mu = \mu_1 \neq \mu_0$, we find that

$\frac{\bar{x}_n - \mu_0}{\sqrt{\hat{\sigma}^2/n}} = \frac{\sqrt{n}(\bar{x}_n - \mu_1)}{\hat{\sigma}} + \frac{\sqrt{n}(\mu_1 - \mu_0)}{\hat{\sigma}} = N(0, 1) + O_p(\sqrt{n})$   (4.14)
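In practice the ratio (4.13) is computed directly from the data. A minimal sketch with simulated data — the sample size, mean, and variance are illustrative assumptions of mine:

```python
import numpy as np

# Normal-ratio (asymptotic z) test of H0: mu = mu0 via (4.13):
# (xbar - mu0)/sqrt(sigma2_hat/n) is approximately N(0,1) under H0,
# so |ratio| > 1.96 gives an asymptotic 5% rejection rule.
rng = np.random.default_rng(2)

def z_ratio(x, mu0):
    n = len(x)
    sigma2_hat = x.var()   # consistent (if slightly biased) estimator of sigma^2
    return (x.mean() - mu0) / np.sqrt(sigma2_hat / n)

x = rng.normal(loc=5.0, scale=2.0, size=500)   # true mu = 5
print("ratio under H0 (mu0=5):", z_ratio(x, 5.0))  # typically inside (-1.96, 1.96)
print("ratio under H1 (mu0=4):", z_ratio(x, 4.0))  # large: the O_p(sqrt(n)) drift
```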

Thus, extreme values of the ratio can be taken as rare events under the null or typical events under the alternative.

Such ratios are of interest in estimation and inference with regard to more general parameters. Suppose that $\theta$ is the parameter vector and $\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{d} N(0, \Psi)$, where $\Psi$ is $p \times p$. Then, if $\theta_i$ is the parameter of particular interest, we consider

$\frac{\hat{\theta}_i - \theta_i}{\sqrt{\psi_{ii}/n}} \xrightarrow{d} N(0, 1),$   (4.15)

and, duplicating the arguments for the sample mean,

$\frac{\hat{\theta}_i - \theta_i}{\sqrt{\hat{\psi}_{ii}/n}} \xrightarrow{d} N(0, 1),$   (4.16)

where $\hat{\Psi}$ is a consistent estimator of $\Psi$. This ratio will have similar behavior under a null and alternative hypothesis with regard to $\theta_i$.

4.3.2 Asymptotic Chi-Square

Suppose that $\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{d} N(0, \Psi)$, where $\Psi$ is a nonsingular $p \times p$ matrix. Then, from the previous chapter we have

$\sqrt{n}(\hat{\theta} - \theta)' \Psi^{-1} \sqrt{n}(\hat{\theta} - \theta) = n(\hat{\theta} - \theta)' \Psi^{-1} (\hat{\theta} - \theta) \xrightarrow{d} \chi^2_p$   (4.17)

and

$n(\hat{\theta} - \theta)' \hat{\Psi}^{-1} (\hat{\theta} - \theta) \xrightarrow{d} \chi^2_p$   (4.18)

for $\hat{\Psi}$ a consistent estimator of $\Psi$.

This result can be used to conduct inference by testing the entire parameter vector. If $H_0: \theta = \theta_0$, then

$n(\hat{\theta} - \theta_0)' \hat{\Psi}^{-1} (\hat{\theta} - \theta_0) \xrightarrow{d} \chi^2_p,$   (4.19)

and large positive values are rare events, while for $H_1: \theta = \theta_1 \neq \theta_0$, we can show (later)

$n(\hat{\theta} - \theta_0)' \hat{\Psi}^{-1} (\hat{\theta} - \theta_0) = n((\hat{\theta} - \theta_1) + (\theta_1 - \theta_0))' \hat{\Psi}^{-1} ((\hat{\theta} - \theta_1) + (\theta_1 - \theta_0))$
$= n(\hat{\theta} - \theta_1)' \hat{\Psi}^{-1} (\hat{\theta} - \theta_1) + 2n(\theta_1 - \theta_0)' \hat{\Psi}^{-1} (\hat{\theta} - \theta_1) + n(\theta_1 - \theta_0)' \hat{\Psi}^{-1} (\theta_1 - \theta_0)$
$= \chi^2_p + O_p(\sqrt{n}) + O_p(n)$   (4.20)
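The quadratic form (4.19) can be illustrated for a bivariate mean. A sketch under assumed data — the mean vector, covariance matrix, and sample size are all illustrative choices of mine:

```python
import numpy as np

# Chi-square test (4.19) of H0: theta = theta0 for a bivariate mean.
# Under H0 the statistic n (theta_hat - theta0)' Psi_hat^{-1} (theta_hat - theta0)
# is approximately chi^2_2, whose 5% critical value is about 5.99.
rng = np.random.default_rng(3)
n = 1000
theta0 = np.array([0.0, 1.0])                    # true mean, so H0 holds
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
x = rng.multivariate_normal(theta0, Sigma, size=n)

theta_hat = x.mean(axis=0)
Psi_hat = np.cov(x, rowvar=False)                # consistent estimator of Psi
diff = theta_hat - theta0
stat = n * diff @ np.linalg.inv(Psi_hat) @ diff
print("chi-square statistic:", stat)             # rarely above 5.99 under H0
```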

Thus, if we obtain a large value of the statistic, we may take it as evidence that the null hypothesis is incorrect.

This result can also be applied to any subvector of $\theta$. Let

$\theta = \begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix},$

where $\theta_1$ is a $p_1 \times 1$ subvector. Then

$\sqrt{n}(\hat{\theta}_1 - \theta_1) \xrightarrow{d} N(0, \Psi_{11}),$   (4.21)

where $\Psi_{11}$ is the upper left-hand $p_1 \times p_1$ submatrix of $\Psi$, and

$n(\hat{\theta}_1 - \theta_1)' \hat{\Psi}_{11}^{-1} (\hat{\theta}_1 - \theta_1) \xrightarrow{d} \chi^2_{p_1}$   (4.22)

4.3.3 Tests Of General Restrictions

We can use similar results to test general nonlinear restrictions. Suppose that $r(\cdot)$ is a $q \times 1$ continuously differentiable function, and

$\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{d} N(0, \Psi).$

By the intermediate value theorem we can obtain the exact Taylor's series representation

$r(\hat{\theta}) = r(\theta) + \frac{\partial r(\theta^*)}{\partial \theta'} (\hat{\theta} - \theta),$

or equivalently

$\sqrt{n}( r(\hat{\theta}) - r(\theta) ) = \frac{\partial r(\theta^*)}{\partial \theta'} \sqrt{n}(\hat{\theta} - \theta) = R(\theta^*) \sqrt{n}(\hat{\theta} - \theta),$

where $R(\theta) = \partial r(\theta)/\partial \theta'$ and $\theta^*$ lies between $\hat{\theta}$ and $\theta$. Now $\hat{\theta} \xrightarrow{p} \theta$, so $\theta^* \xrightarrow{p} \theta$ and $R(\theta^*) \xrightarrow{p} R(\theta)$. Thus, we have

$\sqrt{n}[ r(\hat{\theta}) - r(\theta) ] \xrightarrow{d} N(0, R(\theta) \Psi R'(\theta)).$   (4.23)

Thus, under $H_0: r(\theta) = 0$, assuming $R(\theta) \Psi R'(\theta)$ is nonsingular, we have

$n \, r(\hat{\theta})' [R(\theta) \Psi R'(\theta)]^{-1} r(\hat{\theta}) \xrightarrow{d} \chi^2_q,$   (4.24)

where $q$ is the length of $r(\cdot)$. In practice, we substitute the consistent estimates $R(\hat{\theta})$ for $R(\theta)$ and $\hat{\Psi}$ for $\Psi$ to obtain, following the arguments given above,

$n \, r(\hat{\theta})' [R(\hat{\theta}) \hat{\Psi} R'(\hat{\theta})]^{-1} r(\hat{\theta}) \xrightarrow{d} \chi^2_q.$   (4.25)

The behavior under the alternative hypothesis will be $O_p(n)$, as above.
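A sketch of the statistic (4.25) for a single nonlinear restriction on a bivariate mean; the restriction $r(\theta) = \theta_1 \theta_2 - 1$, the data-generating values, and the Jacobian worked out below are illustrative assumptions, not from the text:

```python
import numpy as np

# Wald-type test (4.25) of H0: r(theta) = theta_1 * theta_2 - 1 = 0 on a
# bivariate mean, with Jacobian R(theta) = dr/dtheta' = [theta_2, theta_1].
# True means (2.0, 0.5) satisfy the restriction, so H0 holds here.
rng = np.random.default_rng(4)
n = 2000
x = rng.multivariate_normal([2.0, 0.5], np.diag([1.0, 0.25]), size=n)

theta_hat = x.mean(axis=0)
Psi_hat = np.cov(x, rowvar=False)
r = np.array([theta_hat[0] * theta_hat[1] - 1.0])   # q = 1 restriction
R = np.array([[theta_hat[1], theta_hat[0]]])        # 1 x 2 Jacobian at theta_hat
V = R @ Psi_hat @ R.T                               # R Psi R'
stat = float(n * r @ np.linalg.inv(V) @ r)
print("restriction statistic:", stat)  # approx chi^2_1 under H0; 5% cutoff 3.84
```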

Chapter 5

Maximum Likelihood Methods

5.1 Maximum Likelihood Estimation (MLE)

5.1.1 Motivation

Suppose we have a model for the random variable $y_i$, for $i = 1, 2, \ldots, n$, with unknown $(p \times 1)$ parameter vector $\theta$. In many cases, the model will imply a distribution $f(y_i, \theta)$ for each realization of the variable $y_i$.

A basic premise of statistical inference is to avoid unlikely or rare models, for example, in hypothesis testing. If we have a realization of a statistic that exceeds the critical value, then it is a rare event under the null hypothesis. Under the alternative hypothesis, however, such a realization is much more likely to occur, and we reject the null in favor of the alternative. Thus, in choosing between the null and alternative, we select the model that makes the realization of the statistic more likely to have occurred.

Carrying this idea over to estimation, we select values of $\theta$ such that the corresponding values of $f(y_i, \theta)$ are not unlikely. After all, we do not want a model that disagrees strongly with the data. Maximum likelihood estimation is merely a formalization of this notion that the model chosen should not be unlikely. Specifically, we choose the values of the parameters that make the realized data most likely to have occurred. This approach does, however, require that the model be specified in enough detail to imply a distribution for the variable of interest.

5.1.2 The Likelihood Function

Suppose that the random variables $y_1, y_2, \ldots, y_n$ are i.i.d. Then the joint density function for $n$ realizations is

$f(y_1, y_2, \ldots, y_n | \theta) = f(y_1 | \theta) \, f(y_2 | \theta) \cdots f(y_n | \theta) = \prod_{i=1}^{n} f(y_i | \theta)$   (5.1)

Given values of the parameter vector $\theta$, this function allows us to assign local probability measures for various choices of the random variables $y_1, y_2, \ldots, y_n$. This is the function which must be integrated to make probability statements concerning the joint outcomes of $y_1, y_2, \ldots, y_n$. Given a set of realized values of the random variables, we use this same function to establish the probability measure associated with various choices of the parameter vector $\theta$.

Definition 5.1 The likelihood function of the parameters, for a particular sample $y_1, y_2, \ldots, y_n$, is the joint density function considered as a function of $\theta$ given the $y_i$'s. That is,

$L(\theta | y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} f(y_i | \theta)$   (5.2)

5.1.3 Maximum Likelihood Estimation

For a particular choice of the parameter vector $\theta$, the likelihood function gives a probability measure for the realizations that occurred. Consistent with the approach used in hypothesis testing, and using this function as the metric, we choose $\theta$ to make the realizations most likely to have occurred.

Definition 5.2 The maximum likelihood estimator of $\theta$ is the estimator obtained by maximizing $L(\theta | y_1, y_2, \ldots, y_n)$ with respect to $\theta$. That is,

$\hat{\theta} = \arg\max_{\theta} L(\theta | y_1, y_2, \ldots, y_n),$   (5.3)

where $\hat{\theta}$ is called the MLE of $\theta$.

Equivalently, since $\log(\cdot)$ is a strictly monotonic transformation, we may find the MLE of $\theta$ by solving

$\max_{\theta} \mathcal{L}(\theta | y_1, y_2, \ldots, y_n),$   (5.4)

where $\mathcal{L}(\theta | y_1, y_2, \ldots, y_n) = \log L(\theta | y_1, y_2, \ldots, y_n)$

is denoted the log-likelihood function. In practice, we obtain $\hat{\theta}$ by solving the first-order conditions (FOC)

$\frac{\partial \mathcal{L}(\theta; y_1, y_2, \ldots, y_n)}{\partial \theta} = \sum_{i=1}^{n} \frac{\partial \log f(y_i; \hat{\theta})}{\partial \theta} = 0.$

The motivation for using the log-likelihood function is apparent, since the summation form will result, after division by $n$, in estimators that are approximately averages, about which we know a lot. This advantage is particularly clear in the following example.

Example 5.1 Suppose that $y_i$ i.i.d. $N(\mu, \sigma^2)$ for $i = 1, 2, \ldots, n$. Then,

$f(y_i | \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(y_i - \mu)^2},$ for $i = 1, 2, \ldots, n$.

Using the likelihood function (5.2), we have

$L(\mu, \sigma^2 | y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(y_i - \mu)^2}.$   (5.5)

Next, we take the logarithm of (5.5), which gives us

$\log L(\mu, \sigma^2 | y_1, y_2, \ldots, y_n) = \sum_{i=1}^{n} \left[ -\frac{1}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(y_i - \mu)^2 \right] = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \mu)^2$   (5.6)

We then maximize (5.6) with respect to both $\mu$ and $\sigma^2$. That is, we solve the following first-order conditions:

(A) $\frac{\partial \log L(\cdot)}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (y_i - \mu) = 0$;

(B) $\frac{\partial \log L(\cdot)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n} (y_i - \mu)^2 = 0$.

By solving (A), we find that $\mu = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y}_n$. Solving (B) gives us $\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y}_n)^2$. Therefore, $\hat{\mu} = \bar{y}_n$ and $\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{\mu})^2$.

Note that $\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{\mu})^2 \neq s^2$, where $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \hat{\mu})^2$. $s^2$ is the familiar unbiased estimator of $\sigma^2$, and $\hat{\sigma}^2$ is a biased estimator.
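The closed forms in Example 5.1 can be verified directly. A minimal sketch — the simulated data and the comparison points are illustrative assumptions of mine:

```python
import numpy as np

# Closed-form MLEs from Example 5.1: mu_hat is the sample mean and
# sigma2_hat divides by n (biased), versus s^2 which divides by n - 1.
rng = np.random.default_rng(5)
y = rng.normal(loc=3.0, scale=1.5, size=50)
n = len(y)

mu_hat = y.mean()
sigma2_hat = np.sum((y - mu_hat) ** 2) / n     # MLE, biased
s2 = np.sum((y - mu_hat) ** 2) / (n - 1)       # familiar unbiased estimator

def loglik(mu, sigma2):
    # Log-likelihood (5.6) for the normal sample.
    return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum((y - mu) ** 2) / (2 * sigma2)

# The log-likelihood at (mu_hat, sigma2_hat) beats nearby parameter values,
# since these are the exact maximizers of (5.6).
print(loglik(mu_hat, sigma2_hat) >= loglik(mu_hat + 0.1, sigma2_hat))  # True
print("sigma2_hat:", sigma2_hat, " s2:", s2)   # sigma2_hat = (n-1)/n * s2
```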

5.2 Asymptotic Behavior of MLEs

5.2.1 Assumptions

For the results we will derive in the following sections, we need to make five assumptions:

1. The $y_i$ are i.i.d. random variables with density function $f(y_i, \theta)$ for $i = 1, 2, \ldots, n$;
2. $\log f(y_i, \theta)$, and hence $f(y_i, \theta)$, possess derivatives with respect to $\theta$ up to the third order for $\theta \in \Theta$;
3. The range of $y_i$ is independent of $\theta$; hence differentiation under the integral is possible;
4. The parameter vector $\theta$ is globally identified by the density function;
5. $\partial^3 \log f(y_i, \theta) / \partial \theta_i \partial \theta_j \partial \theta_k$ is bounded in absolute value by some function $H_{ijk}(y)$ for all $y$ and $\theta \in \Theta$, which, in turn, has a finite expectation for all $\theta \in \Theta$.

The first assumption is fundamental and the basis of the estimator. If it is not satisfied, then we are misspecifying the model and there is little hope of obtaining correct inferences, at least in finite samples. The second assumption is a regularity condition that is usually satisfied and easily verified. The third assumption is also easily verified and guaranteed to be satisfied in models where the dependent variable has smooth and infinite support. The fourth assumption must be verified, which is easier in some cases than others. The last assumption is crucial and bears a cost; it really should be verified before MLE is undertaken, but is usually ignored.

5.2.2 Some Preliminaries

Now, we know that

$\int L(\theta^0 | y) \, dy = \int f(y | \theta^0) \, dy = 1$   (5.7)

for any value of the true parameter vector $\theta^0$. Therefore,

$0 = \frac{\partial \int L(\theta^0 | y) \, dy}{\partial \theta} = \int \frac{\partial L(\theta^0 | y)}{\partial \theta} \, dy = \int \frac{\partial f(y | \theta^0)}{\partial \theta} \, dy = \int \frac{\partial \log f(y | \theta^0)}{\partial \theta} f(y | \theta^0) \, dy,$   (5.8)

and

$0 = E\left[ \frac{\partial \log f(y | \theta^0)}{\partial \theta} \right] = E\left[ \frac{\partial \log L(\theta^0 | y)}{\partial \theta} \right]$   (5.9)

for any value of the true parameter vector $\theta^0$.

Differentiating (5.8) again yields

$0 = \int \left[ \frac{\partial^2 \log f(y | \theta^0)}{\partial \theta \, \partial \theta'} f(y | \theta^0) + \frac{\partial \log f(y | \theta^0)}{\partial \theta} \frac{\partial f(y | \theta^0)}{\partial \theta'} \right] dy.$   (5.10)

Since

$\frac{\partial f(y | \theta^0)}{\partial \theta'} = \frac{\partial \log f(y | \theta^0)}{\partial \theta'} f(y | \theta^0),$   (5.11)

we can rewrite (5.10) as

$0 = E\left[ \frac{\partial^2 \log f(y | \theta^0)}{\partial \theta \, \partial \theta'} \right] + E\left[ \frac{\partial \log f(y | \theta^0)}{\partial \theta} \frac{\partial \log f(y | \theta^0)}{\partial \theta'} \right],$   (5.12)

or, in terms of the likelihood function,

$\vartheta(\theta^0) = E\left[ \frac{\partial \log L(\theta^0 | y)}{\partial \theta} \frac{\partial \log L(\theta^0 | y)}{\partial \theta'} \right] = -E\left[ \frac{\partial^2 \log L(\theta^0 | y)}{\partial \theta \, \partial \theta'} \right].$   (5.13)

The matrix $\vartheta(\theta^0)$ is called the information matrix, and the relationship given in (5.13) the information matrix equality.

Finally, we note that

$E\left[ \frac{\partial \log L(\theta^0 | y)}{\partial \theta} \frac{\partial \log L(\theta^0 | y)}{\partial \theta'} \right] = E\left[ \left( \sum_{i=1}^{n} \frac{\partial \log f_i(y | \theta^0)}{\partial \theta} \right) \left( \sum_{j=1}^{n} \frac{\partial \log f_j(y | \theta^0)}{\partial \theta'} \right) \right] = \sum_{i=1}^{n} E\left[ \frac{\partial \log f_i(y | \theta^0)}{\partial \theta} \frac{\partial \log f_i(y | \theta^0)}{\partial \theta'} \right],$   (5.14)
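The information matrix equality (5.13) can be checked numerically in the simplest case, $N(\mu, \sigma^2)$ with $\sigma^2$ known, where the score for $\mu$ is $(y - \mu)/\sigma^2$ and the Hessian is the constant $-1/\sigma^2$. A sketch with illustrative parameter values of my choosing:

```python
import numpy as np

# Information matrix equality for one N(mu, sigma2) observation, sigma2 known:
# E[(d log f / d mu)^2] should equal -E[d^2 log f / d mu^2] = 1/sigma2.
rng = np.random.default_rng(6)
mu, sigma2 = 0.5, 2.0
y = rng.normal(mu, np.sqrt(sigma2), size=1_000_000)

score = (y - mu) / sigma2
outer = np.mean(score ** 2)    # Monte Carlo estimate of E[score^2]
neg_hess = 1.0 / sigma2        # -Hessian is constant here, so no averaging needed
print(outer, neg_hess)         # both near 1/sigma2 = 0.5
```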

since the covariances between different observations are zero.

5.2.3 Asymptotic Properties

Consistent Root Exists

Consider the case where $p = 1$. Then, expanding in a Taylor's series and using the intermediate value theorem on the quadratic term yields

$\frac{1}{n} \frac{\partial \mathcal{L}(\hat{\theta} | y)}{\partial \theta} = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial \log f(y_i | \theta^0)}{\partial \theta} + \frac{1}{n} \sum_{i=1}^{n} \frac{\partial^2 \log f(y_i | \theta^0)}{\partial \theta^2} (\hat{\theta} - \theta^0) + \frac{1}{2n} \sum_{i=1}^{n} \frac{\partial^3 \log f(y_i | \theta^*)}{\partial \theta^3} (\hat{\theta} - \theta^0)^2,$   (5.15)

where $\theta^*$ lies between $\hat{\theta}$ and $\theta^0$. Now, by assumption 5, we have

$\left| \frac{1}{n} \sum_{i=1}^{n} \frac{\partial^3 \log f(y_i | \theta^*)}{\partial \theta^3} \right| \leq k \, \frac{1}{n} \sum_{i=1}^{n} H(y_i),$   (5.16)

for some $k < \infty$. So,

$\frac{1}{n} \frac{\partial \mathcal{L}(\hat{\theta} | y)}{\partial \theta} = a\delta^2 + b\delta + c,$   (5.17)

where $\delta = \hat{\theta} - \theta^0$,

$a = \frac{1}{2n} \sum_{i=1}^{n} \frac{\partial^3 \log f(y_i | \theta^*)}{\partial \theta^3}$, so $|a| \leq \frac{k}{2n} \sum_{i=1}^{n} H(y_i)$,

$b = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial^2 \log f(y_i | \theta^0)}{\partial \theta^2}$, and

$c = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial \log f(y_i | \theta^0)}{\partial \theta}$.

Note that $\frac{1}{n} \sum_{i=1}^{n} H(y_i) = E[H(y_i)] + o_p(1) = O_p(1)$, so $a = O_p(1)$; $\operatorname{plim}_{n\to\infty} c = 0$; and $\operatorname{plim}_{n\to\infty} b = -\vartheta(\theta^0)$.

Now, since $\partial \mathcal{L}(\hat{\theta} | y) / \partial \theta = 0$, we have $a\delta^2 + b\delta + c = 0$. There are two possibilities. If $a \neq 0$ with probability 1, which will occur when the FOC are

nonlinear in $\delta$, then

$\delta = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$   (5.18)

Since $ac = o_p(1)$, then $\delta \xrightarrow{p} 0$ for one root, while $\delta \xrightarrow{p} \vartheta(\theta^0)/\alpha$ for the other root if $\operatorname{plim}_{n\to\infty} a = \alpha \neq 0$ exists. If $a = 0$, then the FOC are linear in $\delta$, whereupon $\delta = -c/b$ and again $\delta \xrightarrow{p} 0$. If the FOC are nonlinear but asymptotically linear, then $a \xrightarrow{p} 0$ and $ac$ in the numerator of (5.18) will still go to zero faster than $a$ in the denominator, and $\delta \xrightarrow{p} 0$. Thus there exists at least one consistent solution $\hat{\theta}$ which satisfies

$\operatorname{plim}_{n\to\infty} (\hat{\theta} - \theta^0) = 0,$   (5.19)

and in the asymptotically nonlinear case there is also a possibly inconsistent solution.

For the case of $\theta$ a vector, we can apply a similar style of proof to show there exists a solution $\hat{\theta}$ to the FOC that satisfies $\operatorname{plim}_{n\to\infty} (\hat{\theta} - \theta^0) = 0$. And in the event of asymptotically nonlinear FOC there is at least one other possibly inconsistent root.

Global Maximum Is Consistent

In the event of multiple roots, we are left with the problem of selecting between them. By assumption 4, the parameter $\theta$ is globally identified by the density function. Formally, this means that

$f(y, \theta) = f(y, \theta^0)$   (5.20)

for all $y$ implies that $\theta = \theta^0$. Now,

$E\left[ \frac{f(y, \theta)}{f(y, \theta^0)} \right] = \int \frac{f(y, \theta)}{f(y, \theta^0)} f(y, \theta^0) \, dy = 1.$   (5.21)

Thus, by Jensen's Inequality, we have

$E\left[ \log \frac{f(y, \theta)}{f(y, \theta^0)} \right] < \log E\left[ \frac{f(y, \theta)}{f(y, \theta^0)} \right] = 0,$   (5.22)

unless $f(y, \theta) = f(y, \theta^0)$ for all $y$, that is, unless $\theta = \theta^0$. Therefore, $E[\log f(y, \theta)]$ achieves a maximum if and only if $\theta = \theta^0$. However, we are solving

$\max_{\theta} \frac{1}{n} \sum_{i=1}^{n} \log f(y_i, \theta) \xrightarrow{p} E[\log f(y_i, \theta)],$   (5.23)

and if we choose an inconsistent root, we will not obtain a global maximum. Thus, asymptotically, the global maximum is a consistent root.

This choice of the global root has added appeal since it is in fact the MLE among the possible alternatives and hence the choice that makes the realized data most likely to have occurred. There are complications in finite samples, since the values of the likelihood function at alternative roots may cross over as the sample size increases. That is, the global maximum in small samples may not be the global maximum in larger samples. An added problem is to identify all the alternative roots so we can choose the global maximum. Sometimes a solution is available in a simple consistent estimator, which may be used to start the nonlinear MLE optimization.

Asymptotic Normality

For $p = 1$, we have $a\delta^2 + b\delta + c = 0$, so

$\delta = \hat{\theta} - \theta^0 = \frac{-c}{a\delta + b} = \frac{-c}{a(\hat{\theta} - \theta^0) + b},$

and

$\sqrt{n}(\hat{\theta} - \theta^0) = \frac{-\sqrt{n}\,c}{a(\hat{\theta} - \theta^0) + b}.$   (5.24)

Now, since $a = O_p(1)$ and $\hat{\theta} - \theta^0 = o_p(1)$, then $a(\hat{\theta} - \theta^0) = o_p(1)$ and

$a(\hat{\theta} - \theta^0) + b \xrightarrow{p} -\vartheta(\theta^0).$   (5.25)

And by the CLT we have

$\sqrt{n}\,c = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{\partial \log f(y_i | \theta^0)}{\partial \theta} \xrightarrow{d} N(0, \vartheta(\theta^0)).$   (5.26)

Substituting these two results in (5.24), we find

$\sqrt{n}(\hat{\theta} - \theta^0) \xrightarrow{d} N(0, \vartheta(\theta^0)^{-1}).$

In general, for $p > 1$, we can apply the same scalar proof to show $\sqrt{n}(\lambda' \hat{\theta} - \lambda' \theta^0) \xrightarrow{d} N(0, \lambda' \vartheta^{-1}(\theta^0) \lambda)$ for any vector $\lambda$, which means

$\sqrt{n}(\hat{\theta} - \theta^0) \xrightarrow{d} N(0, \vartheta^{-1}(\theta^0)),$   (5.27)

if $\hat{\theta}$ is the global maximum.

Cramér-Rao Lower Bound

In addition to being the covariance matrix of the MLE, $\vartheta^{-1}(\theta^0)$ defines a lower bound for covariance matrices of estimators with certain desirable properties. Let $\tilde{\theta}(y)$ be any unbiased estimator; then

$E[\tilde{\theta}(y)] = \int \tilde{\theta}(y) f(y; \theta) \, dy = \theta,$   (5.28)

for any underlying $\theta = \theta^0$. Differentiating both sides of this relationship with respect to $\theta$ yields

$I_p = \frac{\partial E[\tilde{\theta}(y)]}{\partial \theta'} = \int \tilde{\theta}(y) \frac{\partial f(y; \theta^0)}{\partial \theta'} \, dy = \int \tilde{\theta}(y) \frac{\partial \log f(y; \theta^0)}{\partial \theta'} f(y; \theta^0) \, dy = E\left[ \tilde{\theta}(y) \frac{\partial \log f(y; \theta^0)}{\partial \theta'} \right] = E\left[ (\tilde{\theta}(y) - \theta^0) \frac{\partial \log f(y; \theta^0)}{\partial \theta'} \right].$   (5.29)

Next, we let

$C(\theta^0) = E[ (\tilde{\theta}(y) - \theta^0)(\tilde{\theta}(y) - \theta^0)' ]$   (5.30)

be the covariance matrix of $\tilde{\theta}(y)$; then

$\operatorname{Cov}\begin{pmatrix} \tilde{\theta}(y) \\ \partial \log L / \partial \theta \end{pmatrix} = \begin{pmatrix} C(\theta^0) & I_p \\ I_p & \vartheta(\theta^0) \end{pmatrix},$   (5.31)

where $I_p$ is the $p \times p$ identity matrix, and (5.31), as a covariance matrix, is positive semidefinite. Now, for any $(p \times 1)$ vector $a$, we have

$\begin{pmatrix} a' & -a' \vartheta(\theta^0)^{-1} \end{pmatrix} \begin{pmatrix} C(\theta^0) & I_p \\ I_p & \vartheta(\theta^0) \end{pmatrix} \begin{pmatrix} a \\ -\vartheta(\theta^0)^{-1} a \end{pmatrix} = a'[ C(\theta^0) - \vartheta(\theta^0)^{-1} ] a \geq 0.$   (5.32)

Thus, any unbiased estimator $\tilde{\theta}(y)$ has a covariance matrix that exceeds $\vartheta(\theta^0)^{-1}$ by a positive semidefinite matrix. And if the MLE is unbiased, it is efficient within the class of unbiased estimators. Likewise, any CUAN estimator will have an asymptotic covariance exceeding $\vartheta(\theta^0)^{-1}$. Since the asymptotic covariance of the MLE is, in fact, $\vartheta(\theta^0)^{-1}$, it is efficient (asymptotically). $\vartheta(\theta^0)^{-1}$ is called the Cramér-Rao lower bound.

5.3 Maximum Likelihood Inference

5.3.1 Likelihood Ratio Test

Suppose we wish to test $H_0: \theta = \theta^0$ against $H_1: \theta \neq \theta^0$. Then, we define

$L_u = \max_{\theta} L(\theta | y) = L(\hat{\theta} | y)$   (5.33)

and

$L_r = L(\theta^0 | y),$   (5.34)

where $L_u$ is the unrestricted likelihood and $L_r$ is the restricted likelihood. We then form the likelihood ratio

$\lambda = \frac{L_r}{L_u}.$   (5.35)

Note that the restricted likelihood can be no larger than the unrestricted, which maximizes the function. As with estimation, it is more convenient to work with the logs of the likelihood functions. It will be shown below that, under $H_0$,

$LR = -2 \log \lambda = -2 \log \frac{L_r}{L_u} = 2[\mathcal{L}(\hat{\theta} | y) - \mathcal{L}(\theta^0 | y)] \xrightarrow{d} \chi^2_p,$   (5.36)

where $\hat{\theta}$ is the unrestricted MLE and $\theta^0$ is the restricted MLE. If $H_1$ applies, then $LR = O_p(n)$. Large values of this statistic indicate that the restrictions make the observed values much less likely than the unrestricted model, and we prefer the unrestricted model and reject the restrictions.

In general, for $H_0: r(\theta) = 0$ and $H_1: r(\theta) \neq 0$, we have

$L_u = \max_{\theta} L(\theta | y) = L(\hat{\theta} | y)$   (5.37)

and

$L_r = \max_{\theta} L(\theta | y) \text{ s.t. } r(\theta) = 0, \quad = L(\tilde{\theta} | y).$   (5.38)

Under $H_0$,

$LR = 2[\mathcal{L}(\hat{\theta} | y) - \mathcal{L}(\tilde{\theta} | y)] \xrightarrow{d} \chi^2_q,$   (5.39)

where $q$ is the length of $r(\cdot)$. Note that in the general case, the likelihood ratio test requires calculation of both the restricted and the unrestricted MLE.
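For the normal-mean model with known variance the LR statistic (5.36) has a closed form, which the sketch below verifies; the data-generating values $n$, $\mu_0$, and $\sigma^2$ are illustrative assumptions of mine:

```python
import numpy as np

# LR test (5.36) of H0: mu = mu0 in the N(mu, sigma2) model with sigma2 known.
# Here LR = 2[L(mu_hat) - L(mu0)] reduces to n*(ybar - mu0)^2 / sigma2,
# which is chi^2_1 under H0 (5% critical value about 3.84).
rng = np.random.default_rng(7)
n, mu0, sigma2 = 200, 1.0, 4.0
y = rng.normal(mu0, np.sqrt(sigma2), size=n)   # data generated under H0

def loglik(mu):
    return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum((y - mu) ** 2) / (2 * sigma2)

mu_hat = y.mean()                              # unrestricted MLE
LR = 2.0 * (loglik(mu_hat) - loglik(mu0))      # restricted value is loglik(mu0)
print("LR statistic:", LR)                     # equals n*(ybar - mu0)^2 / sigma2
```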

5.3.2 Wald Test

The asymptotic normality of the MLE may be used to obtain a test based only on the unrestricted estimates. Now, under $H_0: \theta = \theta^0$, we have

$\sqrt{n}(\hat{\theta} - \theta^0) \xrightarrow{d} N(0, \vartheta^{-1}(\theta^0)).$   (5.40)

Thus, using the results on the asymptotic behavior of quadratic forms from the previous chapter, we have

$W = n(\hat{\theta} - \theta^0)' \vartheta(\theta^0) (\hat{\theta} - \theta^0) \xrightarrow{d} \chi^2_p,$   (5.41)

which is the Wald test. As we discussed for quadratic tests in general, under $H_1: \theta \neq \theta^0$ we would have $W = O_p(n)$. In practice, since

$-\frac{1}{n} \frac{\partial^2 \mathcal{L}(\hat{\theta} | y)}{\partial \theta \, \partial \theta'} = -\frac{1}{n} \sum_{i=1}^{n} \frac{\partial^2 \log f(y_i | \hat{\theta})}{\partial \theta \, \partial \theta'} \xrightarrow{p} \vartheta(\theta^0),$   (5.42)

we use

$W = -(\hat{\theta} - \theta^0)' \frac{\partial^2 \mathcal{L}(\hat{\theta} | y)}{\partial \theta \, \partial \theta'} (\hat{\theta} - \theta^0) \xrightarrow{d} \chi^2_p.$   (5.43)

Aside from having the same asymptotic distribution, the Likelihood Ratio and Wald tests are asymptotically equivalent in the sense that

$\operatorname{plim}_{n\to\infty} (LR - W) = 0.$   (5.44)

This is shown by expanding $\mathcal{L}(\theta^0 | y)$ in a Taylor's series about $\hat{\theta}$. That is,

$\mathcal{L}(\theta^0) = \mathcal{L}(\hat{\theta}) + \frac{\partial \mathcal{L}(\hat{\theta})}{\partial \theta'} (\theta^0 - \hat{\theta}) + \frac{1}{2} (\theta^0 - \hat{\theta})' \frac{\partial^2 \mathcal{L}(\hat{\theta})}{\partial \theta \, \partial \theta'} (\theta^0 - \hat{\theta}) + \frac{1}{6} \sum_i \sum_j \sum_k \frac{\partial^3 \mathcal{L}(\theta^*)}{\partial \theta_i \, \partial \theta_j \, \partial \theta_k} (\theta^0_i - \hat{\theta}_i)(\theta^0_j - \hat{\theta}_j)(\theta^0_k - \hat{\theta}_k),$   (5.45)

where the third-order term applies the intermediate value theorem for $\theta^*$ between $\hat{\theta}$ and $\theta^0$. Now $\partial \mathcal{L}(\hat{\theta}) / \partial \theta = 0$, and the third-order term can be shown to be $O_p(1/\sqrt{n})$ under assumption 5, whereupon we have

$2[\mathcal{L}(\hat{\theta}) - \mathcal{L}(\theta^0)] = -(\theta^0 - \hat{\theta})' \frac{\partial^2 \mathcal{L}(\hat{\theta})}{\partial \theta \, \partial \theta'} (\theta^0 - \hat{\theta}) + O_p(1/\sqrt{n})$   (5.46)
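In the known-variance normal-mean model the Wald statistic coincides with the LR statistic exactly, since the log-likelihood is exactly quadratic in $\mu$; this makes the asymptotic equivalence (5.44) concrete. A sketch under illustrative assumptions of mine (same kind of setup as before):

```python
import numpy as np

# Wald vs LR in the N(mu, sigma2) model with sigma2 known. The observed
# information for mu is n/sigma2, so W = n*(mu_hat - mu0)^2 / sigma2; the
# exact quadratic log-likelihood makes LR identical to W in this model.
rng = np.random.default_rng(8)
n, mu0, sigma2 = 200, 1.0, 4.0
y = rng.normal(mu0, np.sqrt(sigma2), size=n)
mu_hat = y.mean()

W = n * (mu_hat - mu0) ** 2 / sigma2
LR = (np.sum((y - mu0) ** 2) - np.sum((y - mu_hat) ** 2)) / sigma2
print("W:", W, " LR:", LR)   # identical here, up to floating-point error
```

In curved models the two statistics differ in finite samples by the $O_p(1/\sqrt{n})$ remainder in (5.46), which is why they only agree asymptotically in general.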


More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

A. Exclusive KL View of the MLE

A. Exclusive KL View of the MLE A. Exclusive KL View of the MLE Lets assume a change-of-variable moel p Z z on the ranom variable Z R m, such as the one use in Dinh et al. 2017: z 0 p 0 z 0 an z = ψz 0, where ψ is an invertible function

More information

Calculus in the AP Physics C Course The Derivative

Calculus in the AP Physics C Course The Derivative Limits an Derivatives Calculus in the AP Physics C Course The Derivative In physics, the ieas of the rate change of a quantity (along with the slope of a tangent line) an the area uner a curve are essential.

More information

The Principle of Least Action

The Principle of Least Action Chapter 7. The Principle of Least Action 7.1 Force Methos vs. Energy Methos We have so far stuie two istinct ways of analyzing physics problems: force methos, basically consisting of the application of

More information

Mathematical Review Problems

Mathematical Review Problems Fall 6 Louis Scuiero Mathematical Review Problems I. Polynomial Equations an Graphs (Barrante--Chap. ). First egree equation an graph y f() x mx b where m is the slope of the line an b is the line's intercept

More information

PDE Notes, Lecture #11

PDE Notes, Lecture #11 PDE Notes, Lecture # from Professor Jalal Shatah s Lectures Febuary 9th, 2009 Sobolev Spaces Recall that for u L loc we can efine the weak erivative Du by Du, φ := udφ φ C0 If v L loc such that Du, φ =

More information

1 Math 285 Homework Problem List for S2016

1 Math 285 Homework Problem List for S2016 1 Math 85 Homework Problem List for S016 Note: solutions to Lawler Problems will appear after all of the Lecture Note Solutions. 1.1 Homework 1. Due Friay, April 8, 016 Look at from lecture note exercises:

More information

REAL ANALYSIS I HOMEWORK 5

REAL ANALYSIS I HOMEWORK 5 REAL ANALYSIS I HOMEWORK 5 CİHAN BAHRAN The questions are from Stein an Shakarchi s text, Chapter 3. 1. Suppose ϕ is an integrable function on R with R ϕ(x)x = 1. Let K δ(x) = δ ϕ(x/δ), δ > 0. (a) Prove

More information

Lectures - Week 10 Introduction to Ordinary Differential Equations (ODES) First Order Linear ODEs

Lectures - Week 10 Introduction to Ordinary Differential Equations (ODES) First Order Linear ODEs Lectures - Week 10 Introuction to Orinary Differential Equations (ODES) First Orer Linear ODEs When stuying ODEs we are consiering functions of one inepenent variable, e.g., f(x), where x is the inepenent

More information

Capacity Analysis of MIMO Systems with Unknown Channel State Information

Capacity Analysis of MIMO Systems with Unknown Channel State Information Capacity Analysis of MIMO Systems with Unknown Channel State Information Jun Zheng an Bhaskar D. Rao Dept. of Electrical an Computer Engineering University of California at San Diego e-mail: juzheng@ucs.eu,

More information

Section 7.2. The Calculus of Complex Functions

Section 7.2. The Calculus of Complex Functions Section 7.2 The Calculus of Complex Functions In this section we will iscuss limits, continuity, ifferentiation, Taylor series in the context of functions which take on complex values. Moreover, we will

More information

3.7 Implicit Differentiation -- A Brief Introduction -- Student Notes

3.7 Implicit Differentiation -- A Brief Introduction -- Student Notes Fin these erivatives of these functions: y.7 Implicit Differentiation -- A Brief Introuction -- Stuent Notes tan y sin tan = sin y e = e = Write the inverses of these functions: y tan y sin How woul we

More information

1. Aufgabenblatt zur Vorlesung Probability Theory

1. Aufgabenblatt zur Vorlesung Probability Theory 24.10.17 1. Aufgabenblatt zur Vorlesung By (Ω, A, P ) we always enote the unerlying probability space, unless state otherwise. 1. Let r > 0, an efine f(x) = 1 [0, [ (x) exp( r x), x R. a) Show that p f

More information

Math 1271 Solutions for Fall 2005 Final Exam

Math 1271 Solutions for Fall 2005 Final Exam Math 7 Solutions for Fall 5 Final Eam ) Since the equation + y = e y cannot be rearrange algebraically in orer to write y as an eplicit function of, we must instea ifferentiate this relation implicitly

More information

On conditional moments of high-dimensional random vectors given lower-dimensional projections

On conditional moments of high-dimensional random vectors given lower-dimensional projections Submitte to the Bernoulli arxiv:1405.2183v2 [math.st] 6 Sep 2016 On conitional moments of high-imensional ranom vectors given lower-imensional projections LUKAS STEINBERGER an HANNES LEEB Department of

More information

Econometrics I. September, Part I. Department of Economics Stanford University

Econometrics I. September, Part I. Department of Economics Stanford University Econometrics I Deartment of Economics Stanfor University Setember, 2008 Part I Samling an Data Poulation an Samle. ineenent an ientical samling. (i.i..) Samling with relacement. aroximates samling without

More information

A Review of Multiple Try MCMC algorithms for Signal Processing

A Review of Multiple Try MCMC algorithms for Signal Processing A Review of Multiple Try MCMC algorithms for Signal Processing Luca Martino Image Processing Lab., Universitat e València (Spain) Universia Carlos III e Mari, Leganes (Spain) Abstract Many applications

More information

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE Journal of Soun an Vibration (1996) 191(3), 397 414 THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE E. M. WEINSTEIN Galaxy Scientific Corporation, 2500 English Creek

More information

Some Examples. Uniform motion. Poisson processes on the real line

Some Examples. Uniform motion. Poisson processes on the real line Some Examples Our immeiate goal is to see some examples of Lévy processes, an/or infinitely-ivisible laws on. Uniform motion Choose an fix a nonranom an efine X := for all (1) Then, {X } is a [nonranom]

More information

Lecture 5. Symmetric Shearer s Lemma

Lecture 5. Symmetric Shearer s Lemma Stanfor University Spring 208 Math 233: Non-constructive methos in combinatorics Instructor: Jan Vonrák Lecture ate: January 23, 208 Original scribe: Erik Bates Lecture 5 Symmetric Shearer s Lemma Here

More information

A Weak First Digit Law for a Class of Sequences

A Weak First Digit Law for a Class of Sequences International Mathematical Forum, Vol. 11, 2016, no. 15, 67-702 HIKARI Lt, www.m-hikari.com http://x.oi.org/10.1288/imf.2016.6562 A Weak First Digit Law for a Class of Sequences M. A. Nyblom School of

More information

Math Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors

Math Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors Math 18.02 Notes on ifferentials, the Chain Rule, graients, irectional erivative, an normal vectors Tangent plane an linear approximation We efine the partial erivatives of f( xy, ) as follows: f f( x+

More information

The Exact Form and General Integrating Factors

The Exact Form and General Integrating Factors 7 The Exact Form an General Integrating Factors In the previous chapters, we ve seen how separable an linear ifferential equations can be solve using methos for converting them to forms that can be easily

More information

Implicit Differentiation

Implicit Differentiation Implicit Differentiation Thus far, the functions we have been concerne with have been efine explicitly. A function is efine explicitly if the output is given irectly in terms of the input. For instance,

More information

Applications of the Wronskian to ordinary linear differential equations

Applications of the Wronskian to ordinary linear differential equations Physics 116C Fall 2011 Applications of the Wronskian to orinary linear ifferential equations Consier a of n continuous functions y i (x) [i = 1,2,3,...,n], each of which is ifferentiable at least n times.

More information

ELEC3114 Control Systems 1

ELEC3114 Control Systems 1 ELEC34 Control Systems Linear Systems - Moelling - Some Issues Session 2, 2007 Introuction Linear systems may be represente in a number of ifferent ways. Figure shows the relationship between various representations.

More information

FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction

FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS ALINA BUCUR, CHANTAL DAVID, BROOKE FEIGON, MATILDE LALÍN 1 Introuction In this note, we stuy the fluctuations in the number

More information

Zachary Scherr Math 503 HW 3 Due Friday, Feb 12

Zachary Scherr Math 503 HW 3 Due Friday, Feb 12 Zachary Scherr Math 503 HW 3 Due Friay, Feb 1 1 Reaing 1. Rea sections 7.5, 7.6, 8.1 of Dummit an Foote Problems 1. DF 7.5. Solution: This problem is trivial knowing how to work with universal properties.

More information

Spurious Significance of Treatment Effects in Overfitted Fixed Effect Models Albrecht Ritschl 1 LSE and CEPR. March 2009

Spurious Significance of Treatment Effects in Overfitted Fixed Effect Models Albrecht Ritschl 1 LSE and CEPR. March 2009 Spurious Significance of reatment Effects in Overfitte Fixe Effect Moels Albrecht Ritschl LSE an CEPR March 2009 Introuction Evaluating subsample means across groups an time perios is common in panel stuies

More information

Witt#5: Around the integrality criterion 9.93 [version 1.1 (21 April 2013), not completed, not proofread]

Witt#5: Around the integrality criterion 9.93 [version 1.1 (21 April 2013), not completed, not proofread] Witt vectors. Part 1 Michiel Hazewinkel Sienotes by Darij Grinberg Witt#5: Aroun the integrality criterion 9.93 [version 1.1 21 April 2013, not complete, not proofrea In [1, section 9.93, Hazewinkel states

More information

Math 342 Partial Differential Equations «Viktor Grigoryan

Math 342 Partial Differential Equations «Viktor Grigoryan Math 342 Partial Differential Equations «Viktor Grigoryan 6 Wave equation: solution In this lecture we will solve the wave equation on the entire real line x R. This correspons to a string of infinite

More information

The derivative of a function f(x) is another function, defined in terms of a limiting expression: f(x + δx) f(x)

The derivative of a function f(x) is another function, defined in terms of a limiting expression: f(x + δx) f(x) Y. D. Chong (2016) MH2801: Complex Methos for the Sciences 1. Derivatives The erivative of a function f(x) is another function, efine in terms of a limiting expression: f (x) f (x) lim x δx 0 f(x + δx)

More information

Optimization of Geometries by Energy Minimization

Optimization of Geometries by Energy Minimization Optimization of Geometries by Energy Minimization by Tracy P. Hamilton Department of Chemistry University of Alabama at Birmingham Birmingham, AL 3594-140 hamilton@uab.eu Copyright Tracy P. Hamilton, 1997.

More information

Some functions and their derivatives

Some functions and their derivatives Chapter Some functions an their erivatives. Derivative of x n for integer n Recall, from eqn (.6), for y = f (x), Also recall that, for integer n, Hence, if y = x n then y x = lim δx 0 (a + b) n = a n

More information

Math 115 Section 018 Course Note

Math 115 Section 018 Course Note Course Note 1 General Functions Definition 1.1. A function is a rule that takes certain numbers as inputs an assigns to each a efinite output number. The set of all input numbers is calle the omain of

More information

II. First variation of functionals

II. First variation of functionals II. First variation of functionals The erivative of a function being zero is a necessary conition for the etremum of that function in orinary calculus. Let us now tackle the question of the equivalent

More information

Proof of SPNs as Mixture of Trees

Proof of SPNs as Mixture of Trees A Proof of SPNs as Mixture of Trees Theorem 1. If T is an inuce SPN from a complete an ecomposable SPN S, then T is a tree that is complete an ecomposable. Proof. Argue by contraiction that T is not a

More information

0.1 Differentiation Rules

0.1 Differentiation Rules 0.1 Differentiation Rules From our previous work we ve seen tat it can be quite a task to calculate te erivative of an arbitrary function. Just working wit a secon-orer polynomial tings get pretty complicate

More information

Math 1B, lecture 8: Integration by parts

Math 1B, lecture 8: Integration by parts Math B, lecture 8: Integration by parts Nathan Pflueger 23 September 2 Introuction Integration by parts, similarly to integration by substitution, reverses a well-known technique of ifferentiation an explores

More information

Two formulas for the Euler ϕ-function

Two formulas for the Euler ϕ-function Two formulas for the Euler ϕ-function Robert Frieman A multiplication formula for ϕ(n) The first formula we want to prove is the following: Theorem 1. If n 1 an n 2 are relatively prime positive integers,

More information

Modeling of Dependence Structures in Risk Management and Solvency

Modeling of Dependence Structures in Risk Management and Solvency Moeling of Depenence Structures in Risk Management an Solvency University of California, Santa Barbara 0. August 007 Doreen Straßburger Structure. Risk Measurement uner Solvency II. Copulas 3. Depenent

More information

Calculus and optimization

Calculus and optimization Calculus an optimization These notes essentially correspon to mathematical appenix 2 in the text. 1 Functions of a single variable Now that we have e ne functions we turn our attention to calculus. A function

More information

TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS

TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS MISN-0-4 TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS f(x ± ) = f(x) ± f ' (x) + f '' (x) 2 ±... 1! 2! = 1.000 ± 0.100 + 0.005 ±... TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS by Peter Signell 1.

More information

4. Distributions of Functions of Random Variables

4. Distributions of Functions of Random Variables 4. Distributions of Functions of Random Variables Setup: Consider as given the joint distribution of X 1,..., X n (i.e. consider as given f X1,...,X n and F X1,...,X n ) Consider k functions g 1 : R n

More information

Relative Entropy and Score Function: New Information Estimation Relationships through Arbitrary Additive Perturbation

Relative Entropy and Score Function: New Information Estimation Relationships through Arbitrary Additive Perturbation Relative Entropy an Score Function: New Information Estimation Relationships through Arbitrary Aitive Perturbation Dongning Guo Department of Electrical Engineering & Computer Science Northwestern University

More information

LeChatelier Dynamics

LeChatelier Dynamics LeChatelier Dynamics Robert Gilmore Physics Department, Drexel University, Philaelphia, Pennsylvania 1914, USA (Date: June 12, 28, Levine Birthay Party: To be submitte.) Dynamics of the relaxation of a

More information

Conservation Laws. Chapter Conservation of Energy

Conservation Laws. Chapter Conservation of Energy 20 Chapter 3 Conservation Laws In orer to check the physical consistency of the above set of equations governing Maxwell-Lorentz electroynamics [(2.10) an (2.12) or (1.65) an (1.68)], we examine the action

More information

Basic Thermoelasticity

Basic Thermoelasticity Basic hermoelasticity Biswajit Banerjee November 15, 2006 Contents 1 Governing Equations 1 1.1 Balance Laws.............................................. 2 1.2 he Clausius-Duhem Inequality....................................

More information

A Sketch of Menshikov s Theorem

A Sketch of Menshikov s Theorem A Sketch of Menshikov s Theorem Thomas Bao March 14, 2010 Abstract Let Λ be an infinite, locally finite oriente multi-graph with C Λ finite an strongly connecte, an let p

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

Calculus of Variations

Calculus of Variations Calculus of Variations Lagrangian formalism is the main tool of theoretical classical mechanics. Calculus of Variations is a part of Mathematics which Lagrangian formalism is base on. In this section,

More information

Math 300 Winter 2011 Advanced Boundary Value Problems I. Bessel s Equation and Bessel Functions

Math 300 Winter 2011 Advanced Boundary Value Problems I. Bessel s Equation and Bessel Functions Math 3 Winter 2 Avance Bounary Value Problems I Bessel s Equation an Bessel Functions Department of Mathematical an Statistical Sciences University of Alberta Bessel s Equation an Bessel Functions We use

More information

International Journal of Pure and Applied Mathematics Volume 35 No , ON PYTHAGOREAN QUADRUPLES Edray Goins 1, Alain Togbé 2

International Journal of Pure and Applied Mathematics Volume 35 No , ON PYTHAGOREAN QUADRUPLES Edray Goins 1, Alain Togbé 2 International Journal of Pure an Applie Mathematics Volume 35 No. 3 007, 365-374 ON PYTHAGOREAN QUADRUPLES Eray Goins 1, Alain Togbé 1 Department of Mathematics Purue University 150 North University Street,

More information

The Delta Method and Applications

The Delta Method and Applications Chapter 5 The Delta Method and Applications 5.1 Local linear approximations Suppose that a particular random sequence converges in distribution to a particular constant. The idea of using a first-order

More information

Calculus of Variations

Calculus of Variations 16.323 Lecture 5 Calculus of Variations Calculus of Variations Most books cover this material well, but Kirk Chapter 4 oes a particularly nice job. x(t) x* x*+ αδx (1) x*- αδx (1) αδx (1) αδx (1) t f t

More information