Chapter 1

What Is Econometrics?

1.1 Data

1.1.1 Accounting

Definition 1.1 Accounting data is routinely recorded data (that is, records of transactions) collected as part of market activities.

Most of the data used in econometrics is accounting data. Accounting data has many shortcomings, since it is not collected specifically for the purposes of the econometrician. An econometrician would prefer data collected in experiments or gathered for her purposes.

1.1.2 Nonexperimental

The econometrician does not have any control over nonexperimental data. This causes various problems. Econometrics is used to deal with or solve these problems.

1.1.3 Data Types

Definition 1.2 Time-series data are data that represent repeated observations of some variable in subsequent time periods. A time-series variable is often subscripted with the letter t.

Definition 1.3 Cross-sectional data are data that represent a set of observations of some variable at one specific instant over several agents. A cross-sectional variable is often subscripted with the letter i.

Definition 1.4 Time-series cross-sectional data are data that are both time-series and cross-sectional.
A special case of time-series cross-sectional data is panel data. Panel data are observations of the same set of agents over time.

1.1.4 Empirical Regularities

We need to look at the data to detect regularities. Often, we use stylized facts, but this can lead to over-simplifications.

1.2 Models

Models are simplifications of the real world. The data we use in our model is what motivates theory.

1.2.1 Economic Theory

By choosing assumptions, the …

1.2.2 Postulated Relationships

Models can also be developed by postulating relationships among the variables.

1.2.3 Equations

Economic models are usually stated in terms of one or more equations.

1.2.4 Error Component

Because none of our models are exactly correct, we include an error component in our equations, usually denoted u_i. In econometrics, we usually assume that the error component is stochastic (that is, random). It is important to note that the error component cannot be modeled by economic theory. We impose assumptions on u_i and, as econometricians, focus on u_i.

1.3 Statistics

1.3.1 Stochastic Component

Some stuff here.

1.3.2 Statistical Analysis

Some stuff here.
1.3.3 Tasks

There are three main tasks of econometrics:

1. estimating parameters;
2. hypothesis testing;
3. forecasting.

Forecasting is perhaps the main reason for econometrics.

1.3.4 Econometrics

Since our data is, in general, nonexperimental, econometrics makes use of economic theory to adjust for the lack of proper data.

1.4 Interaction

1.4.1 Scientific Method

Econometrics uses the scientific method, but the data are nonexperimental. In some sense, this is similar to astronomy: astronomers gather data but cannot conduct experiments (for example, astronomers predict the existence of black holes, but have never made one in a lab).

1.4.2 Role Of Econometrics

The three components of econometrics are:

1. theory;
2. statistics;
3. data.

These components are interdependent, and each helps shape the others.

1.4.3 Occam's Razor

Often in econometrics, we are faced with the problem of choosing one model over an alternative. The simplest model that fits the data is often the best choice.
Chapter 2

Some Useful Distributions

2.1 Introduction

2.1.1 Inferences

Statistical Statements. As statisticians, we are often called upon to answer questions or make statements concerning certain random variables. For example: is a coin fair (i.e., is the probability of heads 0.5)? What is the expected value of GNP for the quarter?

Population Distribution. Typically, answering such questions requires knowledge of the distribution of the random variable. Unfortunately, we usually do not know this distribution (although we may have a strong idea, as in the case of the coin).

Experimentation. In order to gain knowledge of the distribution, we draw several realizations of the random variable. The notion is that the observations in this sample contain information concerning the population distribution.

Inference.

Definition 2.1 The process by which we make statements concerning the population distribution based on the sample observations is called inference.

Example 2.1 We decide whether a coin is fair by tossing it several times and observing whether it seems to be heads about half the time.
2.1.2 Random Samples

Suppose we draw n observations of a random variable, denoted {x_1, x_2, ..., x_n}, and each x_i is independent and has the same (marginal) distribution; then {x_1, x_2, ..., x_n} constitute a simple random sample.

Example 2.2 We toss a coin three times. Supposedly, the outcomes are independent. If x_i counts the number of heads for toss i, then we have a simple random sample.

Note that not all samples are simple random samples.

Example 2.3 We are interested in the income level for the population in general. The n observations available in this class are not identical, since the higher-income individuals will tend to be more variable.

Example 2.4 Consider the aggregate consumption level. The n observations available in this set are not independent, since a high consumption level in one period is usually followed by a high level in the next.

2.1.3 Sample Statistics

Definition 2.2 Any function of the observations in the sample which is the basis for inference is called a sample statistic.

Example 2.5 In the coin-tossing experiment, let S count the total number of heads and P = S/3 count the sample proportion of heads. Both S and P are sample statistics.

2.1.4 Sample Distributions

A sample statistic is a random variable: its value will vary from one experiment to another. As a random variable, it is subject to a distribution.

Definition 2.3 The distribution of the sample statistic is the sample distribution of the statistic.

Example 2.6 The statistic S introduced above has a binomial sample distribution. Specifically, Pr(S = 0) = 1/8, Pr(S = 1) = 3/8, Pr(S = 2) = 3/8, and Pr(S = 3) = 1/8.
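The sample distribution in Example 2.6 is easy to check numerically. The notes contain no code, so the following Python sketch (variable names such as `exact` and `simulated` are my own) enumerates the eight equally likely outcomes of three fair tosses and compares the exact probabilities with a Monte Carlo version of the same experiment.

```python
import random
from itertools import product

# Exact sample distribution of S = total heads in 3 fair tosses,
# by enumerating the 8 equally likely outcome sequences.
exact = {s: 0.0 for s in range(4)}
for outcome in product([0, 1], repeat=3):
    exact[sum(outcome)] += 1 / 8

# Monte Carlo version: repeat the 3-toss experiment many times.
random.seed(0)
reps = 100_000
counts = [0, 0, 0, 0]
for _ in range(reps):
    counts[sum(random.randint(0, 1) for _ in range(3))] += 1
simulated = [c / reps for c in counts]
```

Both approaches reproduce the probabilities 1/8, 3/8, 3/8, 1/8 stated in the example.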
2.2 Normality And The Sample Mean

2.2.1 Sample Sum

Consider the simple random sample {x_1, x_2, ..., x_n}, where x_i measures the height of an adult female. We will assume that E x_i = μ and Var(x_i) = σ², for all i = 1, 2, ..., n.

Let S = x_1 + x_2 + ... + x_n denote the sample sum. Now,

  E S = E(x_1 + x_2 + ... + x_n) = E x_1 + E x_2 + ... + E x_n = nμ.   (2.1)

Also,

  Var(S) = E(S − E S)² = E(S − nμ)²
         = E(x_1 + x_2 + ... + x_n − nμ)²
         = E[Σ_{i=1}^n (x_i − μ)]²
         = E[(x_1 − μ)² + (x_1 − μ)(x_2 − μ) + ... + (x_2 − μ)(x_1 − μ) + (x_2 − μ)² + ... + (x_n − μ)²]
         = nσ².   (2.2)

Note that E[(x_i − μ)(x_j − μ)] = 0 for i ≠ j, by independence.

2.2.2 Sample Mean

Let x̄ = S/n denote the sample mean or average.

2.2.3 Moments Of The Sample Mean

The mean of the sample mean is

  E x̄ = E(S/n) = (1/n) E S = (1/n) nμ = μ.   (2.3)

The variance of the sample mean is

  Var(x̄) = E(x̄ − μ)² = E(S/n − μ)² = E[(1/n)(S − nμ)]² = (1/n²) E(S − nμ)²
         = (1/n²) nσ² = σ²/n.   (2.4)

2.2.4 Sampling Distribution

We have been able to establish the mean and variance of the sample mean. However, in order to know its complete distribution precisely, we must know the probability density function (pdf) of the random variable x̄.

2.3 The Normal Distribution

2.3.1 Density Function

Definition 2.4 A continuous random variable x_i with the density function

  f(x_i) = (1/√(2πσ²)) e^{−(x_i − μ)²/(2σ²)}   (2.5)

follows the normal distribution, where μ and σ² are the mean and variance of x_i, respectively.

Since the distribution is characterized by the two parameters μ and σ², we denote a normal random variable by x_i ~ N(μ, σ²).

The normal density function is the familiar bell-shaped curve, as shown in Figure 2.1 for μ = 0 and σ² = 1. It is symmetric about the mean μ. Approximately 2/3 of the probability mass lies within ±σ of μ and about .95 lies within ±2σ. There are numerous examples of random variables that have this shape. Many economic variables are assumed to be normally distributed.

2.3.2 Linear Transformations

Consider the transformed random variable

  Y_i = a + b x_i.

We know that

  μ_Y = E Y_i = a + b μ_x

and

  σ_Y² = E(Y_i − μ_Y)² = b² σ_x².

If x_i is normally distributed, then Y_i is normally distributed as well. That is, Y_i ~ N(μ_Y, σ_Y²).
Figure 2.1: The Standard Normal Distribution

Moreover, if x_i ~ N(μ_x, σ_x²) and z_i ~ N(μ_z, σ_z²) are independent, then

  Y_i = a + b x_i + c z_i ~ N(a + b μ_x + c μ_z, b² σ_x² + c² σ_z²).

These results will be formally demonstrated in a more general setting in the next chapter.

2.3.3 Distribution Of The Sample Mean

If, for each i = 1, 2, ..., n, the x_i are independent, identically distributed (i.i.d.) normal random variables, then

  x̄ ~ N(μ_x, σ_x²/n).   (2.6)

2.3.4 The Standard Normal

The distribution of x̄ will vary with different values of μ_x and σ_x², which is inconvenient. Rather than dealing with a unique distribution for each case, we
perform the following transformation:

  Z = (x̄ − μ_x)/√(σ_x²/n) = x̄/√(σ_x²/n) − μ_x/√(σ_x²/n).

Now,

  E Z = (1/√(σ_x²/n)) E x̄ − μ_x/√(σ_x²/n) = μ_x/√(σ_x²/n) − μ_x/√(σ_x²/n) = 0.

Also,

  Var(Z) = E[(x̄ − μ_x)/√(σ_x²/n)]² = (n/σ_x²) E(x̄ − μ_x)² = (n/σ_x²)(σ_x²/n) = 1.

Thus Z ~ N(0, 1). The N(0, 1) distribution is the standard normal and is well tabulated. The probability density function for the standard normal distribution is

  f(z_i) = (1/√(2π)) e^{−z_i²/2} = φ(z_i).   (2.8)

2.4 The Central Limit Theorem

2.4.1 Normal Theory

The normal density has a prominent position in statistics. This is not only because many random variables appear to be normal, but also because most any sample mean appears normal as the sample size increases. Specifically, suppose x_1, x_2, ..., x_n is a simple random sample with E x_i = μ_x and Var(x_i) = σ_x²; then, as n → ∞, the distribution of x̄ becomes normal. That
is,

  lim_{n→∞} f((x̄ − μ_x)/√(σ_x²/n)) = φ((x̄ − μ_x)/√(σ_x²/n)).   (2.9)

2.5 Distributions Associated With The Normal Distribution

2.5.1 The Chi-Square Distribution

Definition 2.5 Suppose that Z_1, Z_2, ..., Z_n is a simple random sample, and Z_i ~ N(0, 1). Then

  Σ_{i=1}^n Z_i² ~ χ²_n,   (2.10)

where n is the degrees of freedom of the chi-square distribution.

The probability density function of the χ²_n is

  f_{χ²}(x) = [1/(2^{n/2} Γ(n/2))] x^{n/2 − 1} e^{−x/2},  x > 0,   (2.11)

where Γ(·) is the gamma function. See Figure 2.2.

If x_1, x_2, ..., x_n is a simple random sample, and x_i ~ N(μ_x, σ_x²), then

  Σ_{i=1}^n ((x_i − μ)/σ)² ~ χ²_n.   (2.12)

The chi-square distribution will prove useful in testing hypotheses on both the variance of a single variable and the (conditional) means of several. This multivariate usage will be explored in the next chapter.

Example 2.7 Consider the estimator of σ²

  s² = Σ_{i=1}^n (x_i − x̄)²/(n − 1).

Then

  (n − 1) s²/σ² ~ χ²_{n−1}.   (2.13)
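Equation (2.13) can be checked by simulation. This sketch is not part of the original notes; it assumes NumPy is available and uses my own variable names. It draws many samples of size n from a normal distribution and verifies that (n − 1)s²/σ² has the mean (n − 1) and variance 2(n − 1) of a χ²_{n−1} random variable.

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma = 10, 5.0, 2.0
reps = 200_000

x = rng.normal(mu, sigma, size=(reps, n))
s2 = x.var(axis=1, ddof=1)            # unbiased estimator of sigma^2
q = (n - 1) * s2 / sigma**2           # should be chi-square with n-1 df

mean_q = q.mean()                     # chi2_{n-1} has mean n-1 = 9
var_q = q.var()                       # and variance 2(n-1) = 18
```

With 200,000 replications, `mean_q` and `var_q` land very close to 9 and 18.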
Figure 2.2: Some Chi-Square Distributions

2.5.2 The t Distribution

Definition 2.6 Suppose that Z ~ N(0, 1), Y ~ χ²_k, and that Z and Y are independent. Then

  Z/√(Y/k) ~ t_k,   (2.14)

where k is the degrees of freedom of the t distribution.

The probability density function of a t random variable with n degrees of freedom is

  f_t(x) = [Γ((n+1)/2)/(√(nπ) Γ(n/2))] (1 + x²/n)^{−(n+1)/2},   (2.15)

for −∞ < x < ∞. See Figure 2.3.

The t (also known as Student's t) distribution is named after W. S. Gosset, who published under the pseudonym "Student." It is useful in testing hypotheses concerning the (conditional) mean when the variance is estimated.

Example 2.8 Consider the sample mean from a simple random sample of normals. We know that x̄ ~ N(μ, σ²/n) and

  Z = (x̄ − μ)/√(σ²/n) ~ N(0, 1).
Figure 2.3: Some t Distributions

Also, we know that

  Y = (n − 1) s²/σ² ~ χ²_{n−1},

where s² is the unbiased estimator of σ². Thus, if Z and Y are independent (which, in fact, is the case), then

  Z/√(Y/(n−1)) = [(x̄ − μ)/√(σ²/n)] / √{[(n−1)s²/σ²]/(n−1)}
              = √(σ²/s²) (x̄ − μ)/√(σ²/n)
              = (x̄ − μ)/√(s²/n) ~ t_{n−1}.   (2.16)
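The studentized ratio in (2.16) can also be checked numerically. The following sketch (mine, not from the notes; assumes NumPy) builds the ratio (x̄ − μ)/√(s²/n) from normal samples with n = 6 and verifies that its sample variance matches the t_{5} variance 5/(5 − 2) = 5/3, which is larger than the N(0, 1) variance because s² is estimated.

```python
import numpy as np

rng = np.random.default_rng(1)
n, mu, sigma = 6, 0.0, 3.0
reps = 400_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s2 = x.var(axis=1, ddof=1)            # unbiased variance estimator
t = (xbar - mu) / np.sqrt(s2 / n)     # should be t with n-1 = 5 df

mean_t = t.mean()                     # t_5 is symmetric about 0
var_t = t.var()                       # t_5 variance is 5/(5-2) = 5/3
```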
2.5.3 The F Distribution

Definition 2.7 Suppose that Y ~ χ²_m, W ~ χ²_n, and that Y and W are independent. Then

  (Y/m)/(W/n) ~ F_{m,n},   (2.17)

where m, n are the degrees of freedom of the F distribution.

The probability density function of an F random variable with m and n degrees of freedom is

  f_F(x) = [Γ((m+n)/2) (m/n)^{m/2} / (Γ(n/2) Γ(m/2))] x^{m/2 − 1} (1 + mx/n)^{−(m+n)/2}.   (2.18)

The F distribution is named after the great statistician Sir Ronald A. Fisher and is used in many applications, most notably in the analysis of variance. This situation will arise when we seek to test multiple (conditional) mean parameters with estimated variance. Note that when x ~ t_n, then x² ~ F_{1,n}. Some examples of the F distribution can be seen in Figure 2.4.

Figure 2.4: Some F Distributions
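The relation x ~ t_n ⟹ x² ~ F_{1,n} can be illustrated by building both variables from their definitions. This sketch is my own illustration (assumes NumPy): it constructs t_k draws as Z/√(Y/k) and F_{1,k} draws as (Y₁/1)/(W/k), then compares the medians of t² and F, which should agree.

```python
import numpy as np

rng = np.random.default_rng(2)
reps, k = 300_000, 8

# t_k from Definition 2.6: Z / sqrt(Y/k), Z ~ N(0,1), Y ~ chi2_k.
z = rng.standard_normal(reps)
y = rng.chisquare(k, reps)
t = z / np.sqrt(y / k)

# F_{1,k} from Definition 2.7: (Y1/1) / (W/k), Y1 ~ chi2_1, W ~ chi2_k.
y1 = rng.chisquare(1, reps)
w = rng.chisquare(k, reps)
f = y1 / (w / k)

# t^2 and F_{1,k} have the same distribution; compare their medians.
med_t2 = np.median(t**2)
med_f = np.median(f)
```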
Chapter 3

Multivariate Distributions

3.1 Matrix Algebra Of Expectations

3.1.1 Moments of Random Vectors

Let x = (x_1, x_2, ..., x_m)′ be an m × 1 vector-valued random variable. Each element of the vector is a scalar random variable of the type discussed in the previous chapter.

The expectation of a random vector is

  E[x] = (E[x_1], E[x_2], ..., E[x_m])′ = (μ_1, μ_2, ..., μ_m)′ = μ.   (3.1)

Note that μ is also an m × 1 column vector. We see that the mean of the vector is the vector of the means.

Next, we evaluate the following:

  E[(x − μ)(x − μ)′] =
    E | (x_1 − μ_1)²           (x_1 − μ_1)(x_2 − μ_2)  ...  (x_1 − μ_1)(x_m − μ_m) |
      | (x_2 − μ_2)(x_1 − μ_1) (x_2 − μ_2)²            ...  (x_2 − μ_2)(x_m − μ_m) |
      | ...                                                                         |
      | (x_m − μ_m)(x_1 − μ_1) (x_m − μ_m)(x_2 − μ_2)  ...  (x_m − μ_m)²           |
  = | σ_11  σ_12  ...  σ_1m |
    | σ_21  σ_22  ...  σ_2m |
    | ...                   |
    | σ_m1  σ_m2  ...  σ_mm |  = Σ.   (3.2)

Σ, the covariance matrix, is an m × m matrix of variance and covariance terms. The variance σ_i² = σ_ii of x_i is along the diagonal, while the cross-product terms represent the covariance between x_i and x_j.

3.1.2 Properties Of The Covariance Matrix

Symmetric. The variance-covariance matrix Σ is a symmetric matrix. This can be shown by noting that

  σ_ij = E(x_i − μ_i)(x_j − μ_j) = E(x_j − μ_j)(x_i − μ_i) = σ_ji.

Due to this symmetry, Σ will only have m(m + 1)/2 unique elements.

Positive Semidefinite. Σ is a positive semidefinite matrix. Recall that an m × m matrix is positive semidefinite if and only if it meets any of the following three equivalent conditions:

1. all the principal minors are nonnegative;
2. λ′Σλ ≥ 0, for all m × 1 vectors λ ≠ 0;
3. Σ = PP′, for some m × m matrix P.

The first condition (actually, we use negative definiteness) is useful in the study of utility maximization, while the latter two are useful in econometric analysis. The second condition is the easiest to demonstrate in the current context. Let λ ≠ 0. Then we have

  λ′Σλ = λ′E[(x − μ)(x − μ)′]λ = E[λ′(x − μ)(x − μ)′λ] = E{[λ′(x − μ)]²} ≥ 0,
since the term inside the expectation is a quadratic. Hence, Σ is a positive semidefinite matrix.

Note that a P satisfying the third relationship is not unique. Let D be any m × m orthonormal matrix, so DD′ = I_m; then P* = PD yields

  P*P*′ = PDD′P′ = P I_m P′ = Σ.

Usually, we will choose P to be an upper or lower triangular matrix with m(m + 1)/2 nonzero elements.

Positive Definite. Since Σ is a positive semidefinite matrix, it will be a positive definite matrix if and only if det(Σ) ≠ 0. Now, we know that Σ = PP′ for some m × m matrix P. This implies that det(P) ≠ 0.

3.1.3 Linear Transformations

Let y = b + Bx, where y and b are m × 1 and B is m × m. Then

  E[y] = b + B E[x] = b + Bμ = μ_y.   (3.3)

Thus, the mean of a linear transformation is the linear transformation of the mean. Next, we have

  E[(y − μ_y)(y − μ_y)′] = E{[B(x − μ)][B(x − μ)]′}
                        = B E[(x − μ)(x − μ)′] B′
                        = BΣB′   (3.4)
                        = Σ_y,   (3.5)

where we use the result (ABC)′ = C′B′A′, if conformability holds.

3.2 Change Of Variables

3.2.1 Univariate

Let x be a random variable and f_x(·) be the probability density function of x. Now, define y = h(x), where

  h′(x) = dh(x)/dx > 0.
That is, h(x) is a strictly monotonically increasing function, and so y is a one-to-one transformation of x. Now, we would like to know the probability density function of y, f_y(y). To find it, we note that

  Pr(y ≤ h(a)) = Pr(x ≤ a),   (3.6)

and

  Pr(x ≤ a) = ∫_{−∞}^{a} f_x(x) dx = F_x(a),   (3.7)

  Pr(y ≤ h(a)) = ∫_{−∞}^{h(a)} f_y(y) dy = F_y(h(a)),   (3.8)

for all a. Assuming that the cumulative density function is differentiable, we use (3.6) to combine (3.7) and (3.8), and take the total differential, which gives us

  F_x(a) = F_y(h(a))
  f_x(a) da = f_y(h(a)) h′(a) da

for all a. Thus, for a small perturbation,

  f_x(a) = f_y(h(a)) h′(a)   (3.9)

for all a. Also, since y is a one-to-one transformation of x, we know that h(·) can be inverted. That is, x = h⁻¹(y). Thus, a = h⁻¹(y), and we can rewrite (3.9) as

  f_x(h⁻¹(y)) = f_y(y) h′(h⁻¹(y)).

Therefore, the probability density function of y is

  f_y(y) = f_x(h⁻¹(y)) / h′(h⁻¹(y)).   (3.10)

Note that f_y(y) has the property of being nonnegative, since h′(·) > 0. If h′(·) < 0, (3.10) can be corrected by taking the absolute value of h′(·), which will ensure that we have only positive values for our probability density function.

3.2.2 Geometric Interpretation

Consider the graph of the relationship shown in Figure 3.1. We know that

  Pr[h(b) > y > h(a)] = Pr(b > x > a).

Also, we know that

  Pr[h(b) > y > h(a)] ≈ f_y[h(b)][h(b) − h(a)],
Figure 3.1: Change of Variables

and

  Pr(b > x > a) ≈ f_x(b)(b − a).

So,

  f_y[h(b)][h(b) − h(a)] ≈ f_x(b)(b − a)

  f_y[h(b)] ≈ f_x(b) / {[h(b) − h(a)]/(b − a)}.   (3.11)

Now, as we let a → b, the denominator of (3.11) approaches h′(b). This is then the same formula as (3.10).

3.2.3 Multivariate

Let x be m × 1 with density f_x(x). Define a one-to-one transformation y = h(x), where y is m × 1. Since h(·) is a one-to-one transformation, it has an inverse:

  x = h⁻¹(y).
We also assume that ∂h(x)/∂x′ exists. This is the m × m Jacobian matrix

  J_x(x) = ∂h(x)/∂x′ = | ∂h_1(x)/∂x_1  ∂h_1(x)/∂x_2  ...  ∂h_1(x)/∂x_m |
                       | ∂h_2(x)/∂x_1  ∂h_2(x)/∂x_2  ...  ∂h_2(x)/∂x_m |
                       | ...                                            |
                       | ∂h_m(x)/∂x_1  ∂h_m(x)/∂x_2  ...  ∂h_m(x)/∂x_m |.   (3.12)

Given this notation, the multivariate analog to (3.10) can be shown to be

  f_y(y) = f_x[h⁻¹(y)] |det(J_x[h⁻¹(y)])|⁻¹.   (3.13)

Since h(·) is differentiable and one-to-one, det(J_x[h⁻¹(y)]) ≠ 0.

Example 3.1 Let y = b_0 + b_1 x, where x, b_0, and b_1 are scalars. Then x = (y − b_0)/b_1 and dy/dx = b_1. Therefore,

  f_y(y) = f_x((y − b_0)/b_1) |b_1|⁻¹.

Example 3.2 Let y = b + Bx, where y is m × 1 and det(B) ≠ 0. Then x = B⁻¹(y − b) and ∂y/∂x′ = B = J_x(x). Thus,

  f_y(y) = f_x[B⁻¹(y − b)] |det(B)|⁻¹.
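The univariate change-of-variables formula (3.10) can be verified numerically with a concrete nonlinear transformation. This sketch is my own (not from the notes) and uses only the standard library: with h(x) = eˣ and x ~ N(0, 1), formula (3.10) gives the lognormal density f_y(y) = φ(ln y)/y; integrating that density over (1, 2) should match the Monte Carlo frequency of y = eˣ landing in (1, 2).

```python
import math
import random

# h(x) = exp(x), x ~ N(0,1): h'(x) = exp(x), h^{-1}(y) = log(y),
# so (3.10) gives the lognormal density f_y(y) = phi(log y) / y.
def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def f_y(y):
    return phi(math.log(y)) / y

# P(1 < y < 2) from the density, via a midpoint rule on a fine grid.
step = 0.001
grid = [1 + (i + 0.5) * step for i in range(1000)]
p_density = sum(f_y(y) * step for y in grid)

# The same probability by simulating y = exp(x) directly.
random.seed(3)
reps = 200_000
hits = sum(1 for _ in range(reps) if 1 < math.exp(random.gauss(0, 1)) < 2)
p_mc = hits / reps
```

Both numbers agree with the exact value Pr(0 < x < ln 2) = Φ(ln 2) − 1/2 ≈ 0.256.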
3.3 Multivariate Normal Distribution

3.3.1 Spherical Normal Distribution

Definition 3.1 An m × 1 random vector z is said to be spherically normally distributed if

  f(z) = (2π)^{−m/2} e^{−z′z/2}.

Such a random vector can be seen to be a vector of independent standard normals. Let z_1, z_2, ..., z_m be i.i.d. random variables such that z_i ~ N(0, 1). That is, z_i has the pdf given in (2.8), for i = 1, ..., m. Then, by independence, the joint distribution of the z_i is given by

  f(z_1, z_2, ..., z_m) = f(z_1) f(z_2) ··· f(z_m)
                       = ∏_{i=1}^m (1/√(2π)) e^{−z_i²/2}
                       = (2π)^{−m/2} e^{−(1/2) Σ_{i=1}^m z_i²} = (2π)^{−m/2} e^{−z′z/2},   (3.14)

where z′ = (z_1 z_2 ... z_m).

3.3.2 Multivariate Normal

Definition 3.2 The m × 1 random vector x with density

  f_x(x) = (2π)^{−m/2} [det(Σ)]^{−1/2} e^{−(x−μ)′Σ⁻¹(x−μ)/2}   (3.15)

is said to be distributed multivariate normal with mean vector μ and positive definite covariance matrix Σ. Such a distribution for x is denoted by x ~ N(μ, Σ).

The spherical normal distribution is seen to be a special case where μ = 0 and Σ = I_m. There is a one-to-one relationship between the multivariate normal random vector and a spherical normal random vector. Let z be an m × 1 spherical normal random vector, and

  x = μ + Az,

where det(A) ≠ 0. Then,

  E x = μ + A E z = μ,   (3.16)
since E[z] = 0. Also, we know that

  E(zz′) = E | z_1²     z_1 z_2  ...  z_1 z_m |
             | z_2 z_1  z_2²     ...  z_2 z_m |
             | ...                            |
             | z_m z_1  z_m z_2  ...  z_m²    |  = I_m,   (3.17)

since E[z_i z_j] = 0, for all i ≠ j, and E[z_i²] = 1, for all i. Therefore,

  E[(x − μ)(x − μ)′] = E(Azz′A′) = A E(zz′) A′ = A I_m A′ = AA′ = Σ,   (3.18)

where Σ is a positive definite matrix (since det(A) ≠ 0).

Next, we need to find the probability density function f_x(x) of x. We know that z = A⁻¹(x − μ), z′ = (x − μ)′(A⁻¹)′, and J_z(z) = A, so we use (3.13) to get

  f_x(x) = f_z[A⁻¹(x − μ)] |det(A)|⁻¹
         = (2π)^{−m/2} e^{−(x−μ)′(A⁻¹)′A⁻¹(x−μ)/2} |det(A)|⁻¹
         = (2π)^{−m/2} e^{−(x−μ)′(AA′)⁻¹(x−μ)/2} |det(A)|⁻¹,   (3.19)

where we use the results (ABC)⁻¹ = C⁻¹B⁻¹A⁻¹ and (A⁻¹)′ = (A′)⁻¹. However, Σ = AA′, so det(Σ) = det(A) det(A′) = [det(A)]², and |det(A)| = [det(Σ)]^{1/2}. Thus we can rewrite (3.19) as

  f_x(x) = (2π)^{−m/2} [det(Σ)]^{−1/2} e^{−(x−μ)′Σ⁻¹(x−μ)/2},   (3.20)

and we see that x ~ N(μ, Σ) with mean vector μ and covariance matrix Σ. Since this process is completely reversible, the relationship is one-to-one.
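The construction x = μ + Az with Σ = AA′ is exactly how multivariate normal draws are generated in practice. Here is a small sketch of the idea (my own, not from the notes; assumes NumPy): factor Σ with a Cholesky decomposition, push spherical normal draws through the map, and check that the sample mean and covariance recover μ and Σ.

```python
import numpy as np

rng = np.random.default_rng(4)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Factor Sigma = A A' (Cholesky gives a lower-triangular A), then map a
# spherical normal z through x = mu + A z.
A = np.linalg.cholesky(Sigma)
reps = 300_000
z = rng.standard_normal((reps, 2))
x = mu + z @ A.T

mean_hat = x.mean(axis=0)                 # should approach mu  (3.16)
cov_hat = np.cov(x, rowvar=False)         # should approach Sigma  (3.18)
```

The choice of A is not unique, as noted above: any A with AA′ = Σ produces the same distribution.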
3.3.3 Linear Transformations

Theorem 3.1 Suppose x ~ N(μ, Σ) with det(Σ) ≠ 0, and y = b + Bx with B square and det(B) ≠ 0. Then y ~ N(μ_y, Σ_y).

Proof: From (3.3) and (3.4), we have E y = b + Bμ = μ_y and E[(y − μ_y)(y − μ_y)′] = BΣB′ = Σ_y. To find the probability density function f_y(y) of y, we again use (3.13), which gives us

  f_y(y) = f_x[B⁻¹(y − b)] |det(B)|⁻¹
         = (2π)^{−m/2} [det(Σ)]^{−1/2} e^{−[B⁻¹(y−b)−μ]′Σ⁻¹[B⁻¹(y−b)−μ]/2} |det(B)|⁻¹
         = (2π)^{−m/2} [det(BΣB′)]^{−1/2} e^{−(y−b−Bμ)′(BΣB′)⁻¹(y−b−Bμ)/2}
         = (2π)^{−m/2} [det(Σ_y)]^{−1/2} e^{−(y−μ_y)′Σ_y⁻¹(y−μ_y)/2}.   (3.21)

So, y ~ N(μ_y, Σ_y).

Thus we see, as asserted in the previous chapter, that linear transformations of multivariate normal random variables are also multivariate normal random variables. And any linear combination of independent normals will also be normal.

3.3.4 Quadratic Forms

Theorem 3.2 Let x ~ N(μ, Σ), where det(Σ) ≠ 0. Then (x − μ)′Σ⁻¹(x − μ) ~ χ²_m.

Proof: Let Σ = PP′. Then (x − μ) ~ N(0, Σ) and

  z = P⁻¹(x − μ) ~ N(0, I_m).

Therefore,

  z′z = Σ_{i=1}^m z_i² = [P⁻¹(x − μ)]′[P⁻¹(x − μ)] = (x − μ)′(P⁻¹)′P⁻¹(x − μ) = (x − μ)′Σ⁻¹(x − μ) ~ χ²_m.   (3.22)

Σ⁻¹ is called the weight matrix. With this result, we can use the χ²_m to make inferences about the mean μ of x.
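Theorem 3.2 can be checked by simulation. The sketch below (my own illustration, assuming NumPy) draws x ~ N(μ, Σ) for m = 3 and verifies that the quadratic form (x − μ)′Σ⁻¹(x − μ) has the mean m = 3 and variance 2m = 6 of a χ²₃ variable.

```python
import numpy as np

rng = np.random.default_rng(5)
mu = np.array([0.5, -1.0, 2.0])
Sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 2.0, 0.5],
                  [0.0, 0.5, 1.5]])
A = np.linalg.cholesky(Sigma)
reps = 200_000

x = mu + rng.standard_normal((reps, 3)) @ A.T
d = x - mu
Sinv = np.linalg.inv(Sigma)
q = np.einsum('ij,jk,ik->i', d, Sinv, d)   # (x-mu)' Sigma^{-1} (x-mu)

mean_q = q.mean()     # chi2_3 mean is 3
var_q = q.var()       # chi2_3 variance is 6
```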
3.4 Normality and the Sample Mean

3.4.1 Moments of the Sample Mean

Consider the m × 1 vectors x_i, jointly i.i.d., with m × 1 mean vector E[x_i] = μ and m × m covariance matrix E[(x_i − μ)(x_i − μ)′] = Σ. Define x̄_n = (1/n) Σ_{i=1}^n x_i as the vector sample mean, which is also the vector of scalar sample means. The mean of the vector sample mean follows directly:

  E[x̄_n] = (1/n) Σ_{i=1}^n E[x_i] = μ.

Alternatively, this result can be obtained by applying the scalar results element by element. The second moment matrix of the vector sample mean is given by

  E[(x̄_n − μ)(x̄_n − μ)′] = (1/n²) E[(Σ_{i=1}^n x_i − nμ)(Σ_{i=1}^n x_i − nμ)′]
    = (1/n²) E[{(x_1 − μ) + (x_2 − μ) + ... + (x_n − μ)}{(x_1 − μ) + (x_2 − μ) + ... + (x_n − μ)}′]
    = (1/n²) nΣ = (1/n) Σ,

since the covariances between different observations are zero.

3.4.2 Distribution of the Sample Mean

Suppose the x_i are jointly i.i.d. N(μ, Σ). Then it follows from joint multivariate normality that x̄_n must also be multivariate normal, since it is a linear transformation. Specifically, we have

  x̄_n ~ N(μ, (1/n)Σ),

or equivalently

  x̄_n − μ ~ N(0, (1/n)Σ),
  √n(x̄_n − μ) ~ N(0, Σ),
  √n Σ^{−1/2}(x̄_n − μ) ~ N(0, I_m),

where Σ = Σ^{1/2}Σ^{1/2′} and Σ^{−1/2} = (Σ^{1/2})⁻¹.

3.4.3 Multivariate Central Limit Theorem

Theorem 3.3 Suppose that (i) the x_i are jointly i.i.d., (ii) E[x_i] = μ, and (iii) E[(x_i − μ)(x_i − μ)′] = Σ; then

  √n(x̄_n − μ) →_d N(0, Σ),
or equivalently

  z = √n Σ^{−1/2}(x̄_n − μ) →_d N(0, I_m).

These results apply even if the original underlying distribution is not normal and follow directly from the scalar results applied to any linear combination of x̄_n.

3.4.4 Limiting Behavior of Quadratic Forms

Consider the following quadratic form:

  n(x̄_n − μ)′Σ⁻¹(x̄_n − μ) = n(x̄_n − μ)′(Σ^{1/2}Σ^{1/2′})⁻¹(x̄_n − μ)
    = [√n Σ^{−1/2}(x̄_n − μ)]′[√n Σ^{−1/2}(x̄_n − μ)]
    = z′z →_d χ²_m.

This form is convenient for asymptotic joint tests concerning more than one mean at a time.

3.5 Noncentral Distributions

3.5.1 Noncentral Scalar Normal

Definition 3.3 Let x ~ N(μ, σ²). Then

  z = x/σ ~ N(μ/σ, 1)   (3.23)

has a noncentral normal distribution.

Example 3.3 When we do a hypothesis test of the mean, with known variance, we have, under the null hypothesis H_0: μ = μ_0,

  (x − μ_0)/σ ~ N(0, 1)   (3.24)

and, under the alternative H_1: μ = μ_1 ≠ μ_0,

  (x − μ_0)/σ = (x − μ_1)/σ + (μ_1 − μ_0)/σ ~ N(0, 1) + (μ_1 − μ_0)/σ ~ N((μ_1 − μ_0)/σ, 1).   (3.25)

Thus, the behavior of (x − μ_0)/σ under the alternative hypothesis follows a noncentral normal distribution. As this example makes clear, the noncentral normal distribution is especially useful when carefully exploring the behavior of the alternative hypothesis.
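The practical payoff of Example 3.3 is power analysis: under the alternative, the test ratio is a unit-variance normal centered at the noncentrality, so rejections become common. The sketch below (my own, assuming NumPy; the sample size and means are made-up numbers) simulates a one-sided 5% z-test with known σ and checks both the noncentrality and the resulting rejection rate.

```python
import numpy as np

rng = np.random.default_rng(6)
n, sigma = 25, 2.0
mu0, mu1 = 0.0, 1.0            # hypothesized and true means
reps = 200_000

# Under H1 the ratio (xbar - mu0)/(sigma/sqrt(n)) is
# N((mu1 - mu0)/(sigma/sqrt(n)), 1) = N(2.5, 1): a noncentral normal.
se = sigma / np.sqrt(n)
xbar = rng.normal(mu1, se, size=reps)
z = (xbar - mu0) / se

mean_z = z.mean()              # should be near the noncentrality 2.5
power = (z > 1.645).mean()     # one-sided 5% test rejects often under H1
```

The simulated power matches the exact value Φ(2.5 − 1.645) ≈ 0.80.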
3.5.2 Noncentral t

Definition 3.4 Let z ~ N(μ/σ, 1), w ~ χ²_k, and let z and w be independent. Then

  z/√(w/k) ~ t_k(μ/σ)   (3.26)

has a noncentral t distribution. The noncentral t distribution is used in tests of the mean when the variance is unknown.

3.5.3 Noncentral Chi-Square

Definition 3.5 Let z ~ N(μ, I_m). Then

  z′z ~ χ²_m(δ),   (3.27)

has a noncentral chi-square distribution, where δ = μ′μ is the noncentrality parameter.

In the noncentral chi-square distribution, the probability mass is shifted to the right as compared to a central chi-square distribution.

Example 3.4 When we do a test of μ, with known Σ, we have, under H_0: μ = μ_0,

  (x − μ_0)′Σ⁻¹(x − μ_0) ~ χ²_m.   (3.28)

Let z = P⁻¹(x − μ_0), where Σ = PP′. Then, under H_1: μ = μ_1 ≠ μ_0, we have

  (x − μ_0)′Σ⁻¹(x − μ_0) = (x − μ_0)′(P⁻¹)′P⁻¹(x − μ_0) = z′z ~ χ²_m[(μ_1 − μ_0)′Σ⁻¹(μ_1 − μ_0)].   (3.29)

3.5.4 Noncentral F

Definition 3.6 Let Y ~ χ²_m(δ), W ~ χ²_n, and let Y and W be independent random variables. Then

  (Y/m)/(W/n) ~ F_{m,n}(δ),   (3.30)

has a noncentral F distribution, where δ is the noncentrality parameter. The noncentral F distribution is used in tests of mean vectors, where the variance-covariance matrix is unknown and must be estimated.
Chapter 4

Asymptotic Theory

4.1 Convergence Of Random Variables

4.1.1 Limits And Orders Of Sequences

Definition 4.1 A sequence of real numbers a_1, a_2, ..., a_n, ... is said to have a limit of α if for every δ > 0 there exists a positive real number N such that for all n > N, |a_n − α| < δ. This is denoted as lim_{n→∞} a_n = α.

Definition 4.2 A sequence of real numbers {a_n} is said to be of at most order n^k, and we write {a_n} is O(n^k), if

  lim_{n→∞} n^{−k} a_n = c,

where c is any real constant.

Example 4.1 Let {a_n} = 3 + 1/n and {b_n} = 4 − 2n. Then {a_n} is O(1) = O(n⁰), since lim_{n→∞} n⁰ a_n = 3, and {b_n} is O(n), since lim_{n→∞} n⁻¹ b_n = −2.

Definition 4.3 A sequence of real numbers {a_n} is said to be of order smaller than n^k, and we write {a_n} is o(n^k), if

  lim_{n→∞} n^{−k} a_n = 0.
Example 4.2 Let {a_n} = 1/n. Then {a_n} is o(1), since lim_{n→∞} n⁰ a_n = 0.

4.1.2 Convergence In Probability

Definition 4.4 A sequence of random variables y_1, y_2, ..., y_n, ..., with distribution functions F_1(·), F_2(·), ..., F_n(·), ..., is said to converge weakly in probability to some constant c if

  lim_{n→∞} Pr[|y_n − c| > ε] = 0   (4.1)

for every real number ε > 0. Weak convergence in probability is denoted by

  plim_{n→∞} y_n = c,   (4.2)

or sometimes

  y_n →_p c.   (4.3)

This definition is equivalent to saying that we have a sequence of tail probabilities (of being greater than c + ε or less than c − ε), and that the tail probabilities approach 0 as n → ∞, regardless of how small ε is chosen. Equivalently, the probability mass of the distribution of y_n is collapsing about the point c.

Definition 4.5 A sequence of random variables y_1, y_2, ..., y_n, ... is said to converge strongly in probability to some constant c if

  lim_{N→∞} Pr[sup_{n>N} |y_n − c| > ε] = 0,   (4.4)

for any real ε > 0. Strong convergence is also called almost sure convergence and is denoted

  y_n →_{a.s.} c.   (4.5)

Notice that if a sequence of random variables converges strongly in probability, it converges weakly in probability. The difference between the two is that almost
sure convergence involves an element of uniformity that weak convergence does not. A sequence that is weakly convergent can have Pr[|y_n − c| > ε] wiggle-waggle above and below the constant δ used in the limit and then settle down to subsequently be smaller and meet the condition. For strong convergence, once the probability falls below δ for a particular N in the sequence, it will subsequently be smaller for all larger N.

Definition 4.6 A sequence of random variables y_1, y_2, ..., y_n, ... is said to converge in quadratic mean if

  lim_{n→∞} E[y_n] = c and lim_{n→∞} Var[y_n] = 0.

By Chebyshev's inequality, convergence in quadratic mean implies weak convergence in probability. For a random variable x with mean μ and variance σ², Chebyshev's inequality states

  Pr(|x − μ| ≥ kσ) ≤ 1/k².

Let σ_n² denote the variance of y_n; then we can write the condition for the present case as

  Pr(|y_n − E[y_n]| ≥ kσ_n) ≤ 1/k².

Since σ_n² → 0 and E[y_n] → c, the probability will be less than 1/k² for sufficiently large n for any choice of k. But this is just weak convergence in probability to c.

4.1.3 Orders In Probability

Definition 4.7 Let y_1, y_2, ..., y_n, ... be a sequence of random variables. This sequence is said to be bounded in probability if for any 1 > δ > 0 there exist a Δ < ∞ and some N sufficiently large such that

  Pr(|y_n| > Δ) < δ,

for all n > N.

These conditions require that the tail behavior of the distributions of the sequence not be pathological. Specifically, the tail mass of the distributions cannot be drifting away from zero as we move out in the sequence.

Definition 4.8 The sequence of random variables {y_n} is said to be at most of order in probability n^λ, and is denoted O_p(n^λ), if n^{−λ} y_n is bounded in probability.

Example 4.3 Suppose z ~ N(0, 1) and y_n = 3 + nz; then n⁻¹ y_n = 3/n + z is a bounded random variable, since the first term is asymptotically negligible, and we see that y_n = O_p(n).
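Weak convergence in probability (Definition 4.4) is easy to see numerically. The following sketch (my own illustration, assuming NumPy; ε = 0.1 is an arbitrary choice) estimates the tail probability Pr(|x̄_n − μ| > ε) for the sample mean of N(0, 1) data at several sample sizes and shows it collapsing toward zero as n grows.

```python
import numpy as np

rng = np.random.default_rng(7)
mu, eps, reps = 0.0, 0.1, 10_000

# Tail probability Pr(|xbar_n - mu| > eps) for increasing n:
# by the law of large numbers it must fall toward 0.
tail = {}
for n in [10, 100, 1000]:
    xbar = rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)
    tail[n] = (np.abs(xbar - mu) > eps).mean()
```

For these sample sizes the estimated tails drop from roughly 0.75 to essentially zero.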
Definition 4.9 The sequence of random variables {y_n} is said to be of order in probability smaller than n^λ, and is denoted o_p(n^λ), if n^{−λ} y_n →_p 0.

Example 4.4 Convergence in probability can be represented in terms of order in probability. Suppose that y_n →_p c, or equivalently y_n − c →_p 0; then n⁰(y_n − c) →_p 0 and y_n − c = o_p(1).

4.1.4 Convergence In Distribution

Definition 4.10 A sequence of random variables y_1, y_2, ..., y_n, ..., with cumulative distribution functions F_1(·), F_2(·), ..., F_n(·), ..., is said to converge in distribution to a random variable y with cumulative distribution function F(y) if

  lim_{n→∞} F_n(·) = F(·),   (4.6)

for every point of continuity of F(·). The distribution F(·) is said to be the limiting distribution of this sequence of random variables.

For notational convenience, we often write y_n →_d F(·) if a sequence of random variables converges in distribution to F(·). Note that the moments of elements of the sequence do not necessarily converge to the moments of the limiting distribution.

4.1.5 Some Useful Propositions

In the following propositions, let x_n and y_n be sequences of random vectors.

Proposition 4.1 If x_n − y_n converges in probability to zero, and y_n has a limiting distribution, then x_n has a limiting distribution, which is the same.

Proposition 4.2 If y_n has a limiting distribution and plim_{n→∞} x_n = 0, then for z_n = y_n′ x_n, plim_{n→∞} z_n = 0.

Proposition 4.3 Suppose that y_n converges in distribution to a random variable y, and plim_{n→∞} x_n = c. Then x_n′ y_n converges in distribution to c′y.

Proposition 4.4 If g(·) is a continuous function, and if x_n − y_n converges in probability to zero, then g(x_n) − g(y_n) converges in probability to zero.

Proposition 4.5 If g(·) is a continuous function, and if x_n converges in probability to a constant c, then z_n = g(x_n) converges in distribution to the constant g(c).
Proposition 4.6 If g(·) is a continuous function, and if x_n converges in distribution to a random variable x, then z_n = g(x_n) converges in distribution to a random variable g(x).

4.2 Estimation Theory

4.2.1 Properties Of Estimators

Definition 4.11 An estimator θ̂_n of the p × 1 parameter vector θ is a function of the sample observations x_1, x_2, ..., x_n. It follows that θ̂_1, θ̂_2, ..., θ̂_n form a sequence of random variables.

Definition 4.12 The estimator θ̂_n is said to be unbiased if E θ̂_n = θ, for all n.

Definition 4.13 The estimator θ̂_n is said to be asymptotically unbiased if lim_{n→∞} E θ̂_n = θ.

Note that an estimator can be biased in finite samples but asymptotically unbiased.

Definition 4.14 The estimator θ̂_n is said to be consistent if plim_{n→∞} θ̂_n = θ.

Consistency neither implies nor is implied by asymptotic unbiasedness, as demonstrated by the following examples.

Example 4.5 Let

  θ̃_n = θ, with probability 1 − 1/n;
  θ̃_n = θ + nc, with probability 1/n.

We have E θ̃_n = θ + c, so θ̃_n is a biased estimator, and lim_{n→∞} E θ̃_n = θ + c, so θ̃_n is asymptotically biased as well. However, lim_{n→∞} Pr(|θ̃_n − θ| > ε) = 0, so θ̃_n is a consistent estimator.

Example 4.6 Suppose x_i i.i.d. N(μ, σ²) for i = 1, 2, ..., n, ..., and let x̃_n = x_n be an estimator of μ. Now E[x̃_n] = μ, so the estimator is unbiased, but Pr(|x̃_n − μ| > 1.96σ) = .05, so the probability mass is not collapsing about the target point μ, and the estimator is not consistent.
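Example 4.5 is worth simulating, because the result is counterintuitive: the estimator stays biased by exactly c at every n, yet is consistent. The sketch below (my own, assuming NumPy; θ = 1 and c = 5 are made-up values) draws the contaminated estimator at several sample sizes and tracks both the bias and the miss probability Pr(|θ̃_n − θ| > ε).

```python
import numpy as np

rng = np.random.default_rng(8)
theta, c, reps = 1.0, 5.0, 200_000

# Example 4.5: theta_tilde = theta w.p. 1 - 1/n, and theta + n*c w.p. 1/n.
# E[theta_tilde] = theta + c for every n (biased), yet
# Pr(|theta_tilde - theta| > eps) = 1/n -> 0 (consistent).
def draw(n, size):
    contaminated = rng.random(size) < 1.0 / n
    return np.where(contaminated, theta + n * c, theta)

bias, miss = {}, {}
for n in [10, 100, 1000]:
    est = draw(n, reps)
    bias[n] = est.mean() - theta                   # stays near c = 5
    miss[n] = (np.abs(est - theta) > 0.5).mean()   # approx 1/n -> 0
```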
4.2.2 Laws Of Large Numbers And Central Limit Theorems

Most of the large-sample properties of the estimators considered in the sequel derive from the fact that the estimators involve sample averages, and the asymptotic behavior of averages is well studied. In addition to the central limit theorems presented in the previous two chapters, we have the following two laws of large numbers:

Theorem 4.1 If x_1, x_2, ..., x_n is a simple random sample (that is, the x_i are i.i.d.) and E x_i = μ and Var(x_i) = σ², then, by Chebyshev's inequality,

  plim_{n→∞} x̄_n = μ.   (4.7)

Theorem 4.2 (Khintchine) Suppose that x_1, x_2, ..., x_n are i.i.d. random variables such that, for all i = 1, ..., n, E x_i = μ; then

  plim_{n→∞} x̄_n = μ.   (4.8)

Both of these results apply element by element to vectors of estimators. For the sake of completeness, we repeat the following scalar central limit theorem.

Theorem 4.3 (Lindeberg-Levy) Suppose that x_1, x_2, ..., x_n are i.i.d. random variables such that, for all i = 1, ..., n, E x_i = μ and Var(x_i) = σ²; then √n(x̄_n − μ) →_d N(0, σ²), or

  lim_{n→∞} f(√n(x̄_n − μ)) = (1/√(2πσ²)) e^{−[√n(x̄_n − μ)]²/(2σ²)},   (4.9)

the N(0, σ²) density. This result is easily generalized to obtain the multivariate version given in Theorem 3.3.

Theorem 4.4 (Multivariate CLT) Suppose that the m × 1 random vectors x_1, x_2, ..., x_n are (i) jointly i.i.d., (ii) E x_i = μ, and (iii) Cov(x_i) = Σ; then

  √n(x̄_n − μ) →_d N(0, Σ).   (4.10)

4.2.3 CUAN And Efficiency

Definition 4.15 An estimator is said to be consistently uniformly asymptotically normal (CUAN) if it is consistent, if √n(θ̂_n − θ) converges in distribution to N(0, Ψ), and if the convergence is uniform over some compact subset of the parameter space.
Suppose that $\sqrt{n}(\hat{\theta}_n - \theta)$ converges in distribution to $N(0, \Psi)$. Let $\tilde{\theta}_n$ be an alternative estimator such that $\sqrt{n}(\tilde{\theta}_n - \theta)$ converges in distribution to $N(0, \Omega)$.

Definition 4.6 If $\hat{\theta}_n$ is CUAN with asymptotic covariance $\Psi$ and $\tilde{\theta}_n$ is CUAN with asymptotic covariance $\Omega$, then $\hat{\theta}_n$ is asymptotically efficient relative to $\tilde{\theta}_n$ if $\Omega - \Psi$ is a positive semidefinite matrix.

Among other properties, asymptotic relative efficiency implies that the diagonal elements of $\Psi$ are no larger than those of $\Omega$, so the asymptotic variances of $\hat{\theta}_{n,i}$ are no larger than those of $\tilde{\theta}_{n,i}$. A similar result applies for the asymptotic variance of any linear combination.

Definition 4.7 A CUAN estimator $\hat{\theta}_n$ is said to be asymptotically efficient if it is asymptotically efficient relative to any other CUAN estimator.

4.3 Asymptotic Inference

4.3.1 Normal Ratios

Now, under the conditions of the central limit theorem, $\sqrt{n}(\bar{x}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$, so
    $\frac{\bar{x}_n - \mu}{\sqrt{\sigma^2/n}} \xrightarrow{d} N(0, 1).$    (4.11)
Suppose that $\hat{\sigma}^2$ is a consistent estimator of $\sigma^2$. Then we also have
    $\frac{\bar{x}_n - \mu}{\sqrt{\hat{\sigma}^2/n}} = \sqrt{\frac{\sigma^2}{\hat{\sigma}^2}} \cdot \frac{\bar{x}_n - \mu}{\sqrt{\sigma^2/n}} \xrightarrow{d} N(0, 1),$    (4.12)
since the term under the square root converges in probability to one and the remainder converges in distribution to $N(0, 1)$.

Most typically, such ratios will be used for inference in testing a hypothesis. Now, for $H_0: \mu = \mu^0$, we have
    $\frac{\bar{x}_n - \mu^0}{\sqrt{\hat{\sigma}^2/n}} \xrightarrow{d} N(0, 1),$    (4.13)
while under $H_1: \mu = \mu^1 \neq \mu^0$, we find that
    $\frac{\bar{x}_n - \mu^0}{\sqrt{\hat{\sigma}^2/n}} = \frac{\sqrt{n}(\bar{x}_n - \mu^1)}{\hat{\sigma}} + \frac{\sqrt{n}(\mu^1 - \mu^0)}{\hat{\sigma}} = N(0, 1) + O_p(\sqrt{n}).$    (4.14)
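The two behaviors of the studentized ratio, approximately standard normal under the null as in (4.13) and exploding like $O_p(\sqrt{n})$ under the alternative as in (4.14), can be sketched as follows. The normal population and the `ratio` helper are illustrative, not from the text.

```python
import numpy as np

rng = np.random.default_rng(2)

def ratio(y, mu0):
    """(ybar - mu0) / sqrt(sigma_hat^2 / n), using a consistent variance estimate."""
    n = y.size
    return (y.mean() - mu0) / np.sqrt(y.var(ddof=1) / n)

reps, n = 20_000, 200
# Under H0 (true mean equals mu0 = 5) the ratio is approximately N(0, 1):
null_stats = np.array([ratio(rng.normal(5.0, 2.0, n), mu0=5.0) for _ in range(reps)])
size_at_5pct = np.mean(np.abs(null_stats) > 1.96)     # close to 0.05

# Under H1 (true mean 5.5) the ratio grows with the sample size:
alt_stat_small = abs(ratio(rng.normal(5.5, 2.0, 200), mu0=5.0))
alt_stat_large = abs(ratio(rng.normal(5.5, 2.0, 5000), mu0=5.0))
```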
Thus, extreme values of the ratio can be taken as rare events under the null or typical events under the alternative.

Such ratios are of interest in estimation and inference with regard to more general parameters. Suppose that $\theta$ is the parameter vector and $\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{d} N(0, \Psi)$, where $\Psi$ is $p \times p$. Then, if $\theta_i$ is the parameter of particular interest, we consider
    $\frac{\hat{\theta}_i - \theta_i}{\sqrt{\psi_{ii}/n}} \xrightarrow{d} N(0, 1),$    (4.15)
and, duplicating the arguments for the sample mean,
    $\frac{\hat{\theta}_i - \theta_i}{\sqrt{\hat{\psi}_{ii}/n}} \xrightarrow{d} N(0, 1),$    (4.16)
where $\hat{\Psi}$ is a consistent estimator of $\Psi$. This ratio will have a similar behavior under a null and alternative hypothesis with regard to $\theta_i$.

4.3.2 Asymptotic Chi-Square

Suppose that $\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{d} N(0, \Psi)$, where $\hat{\Psi}$ is a consistent estimator of the nonsingular $p \times p$ matrix $\Psi$. Then, from the previous chapter, we have
    $\sqrt{n}(\hat{\theta} - \theta)'\, \Psi^{-1}\, \sqrt{n}(\hat{\theta} - \theta) = n(\hat{\theta} - \theta)' \Psi^{-1} (\hat{\theta} - \theta) \xrightarrow{d} \chi^2_p$    (4.17)
and
    $n(\hat{\theta} - \theta)' \hat{\Psi}^{-1} (\hat{\theta} - \theta) \xrightarrow{d} \chi^2_p$    (4.18)
for $\hat{\Psi}$ a consistent estimator of $\Psi$.

This result can be used to conduct inference by testing the entire parameter vector. If $H_0: \theta = \theta^0$, then
    $n(\hat{\theta} - \theta^0)' \hat{\Psi}^{-1} (\hat{\theta} - \theta^0) \xrightarrow{d} \chi^2_p,$    (4.19)
and large positive values are rare events, while for $H_1: \theta = \theta^1 \neq \theta^0$, we can show (later)
    $n(\hat{\theta} - \theta^0)' \hat{\Psi}^{-1} (\hat{\theta} - \theta^0) = n((\hat{\theta} - \theta^1) + (\theta^1 - \theta^0))' \hat{\Psi}^{-1} ((\hat{\theta} - \theta^1) + (\theta^1 - \theta^0))$
    $= n(\hat{\theta} - \theta^1)' \hat{\Psi}^{-1} (\hat{\theta} - \theta^1) + 2n(\theta^1 - \theta^0)' \hat{\Psi}^{-1} (\hat{\theta} - \theta^1) + n(\theta^1 - \theta^0)' \hat{\Psi}^{-1} (\theta^1 - \theta^0)$
    $= \chi^2_p + O_p(\sqrt{n}) + O_p(n).$    (4.20)
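A quadratic form like (4.19) can be checked numerically. This sketch (a hypothetical bivariate normal population; all numbers are illustrative) verifies that under the null the statistic behaves like $\chi^2$ with $p = 2$ degrees of freedom: mean near 2, and roughly a 5% chance of exceeding the 5% critical value 5.991.

```python
import numpy as np

rng = np.random.default_rng(3)

mu0 = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
n, reps = 500, 10_000

stats = np.empty(reps)
for r in range(reps):
    y = rng.multivariate_normal(mu0, Sigma, size=n)
    ybar = y.mean(axis=0)
    Sigma_hat = np.cov(y, rowvar=False)               # consistent estimator of Sigma
    d = ybar - mu0
    stats[r] = n * d @ np.linalg.inv(Sigma_hat) @ d   # eq. (4.19), ~ chi^2_2 under H0

mean_stat = stats.mean()                # chi^2_2 has mean 2
rej_rate = np.mean(stats > 5.991)       # 5% critical value of chi^2_2
```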
Thus, if we obtain a large value of the statistic, we may take it as evidence that the null hypothesis is incorrect.

This result can also be applied to any subvector of $\theta$. Let
    $\theta = \begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix},$
where $\theta_1$ is a $q \times 1$ vector. Then
    $\sqrt{n}(\hat{\theta}_1 - \theta_1) \xrightarrow{d} N(0, \Psi_{11}),$    (4.21)
where $\Psi_{11}$ is the upper left-hand $q \times q$ submatrix of $\Psi$, and
    $n(\hat{\theta}_1 - \theta_1)' \hat{\Psi}_{11}^{-1} (\hat{\theta}_1 - \theta_1) \xrightarrow{d} \chi^2_q.$    (4.22)

4.3.3 Tests of General Restrictions

We can use similar results to test general nonlinear restrictions. Suppose that $r(\cdot)$ is a $q \times 1$ continuously differentiable function, and $\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{d} N(0, \Psi)$. By the intermediate value theorem we can obtain the exact Taylor series representation
    $r(\hat{\theta}) = r(\theta) + \frac{\partial r(\theta^*)}{\partial \theta'} (\hat{\theta} - \theta),$
or equivalently
    $\sqrt{n}(r(\hat{\theta}) - r(\theta)) = \frac{\partial r(\theta^*)}{\partial \theta'} \sqrt{n}(\hat{\theta} - \theta) = R(\theta^*)\, \sqrt{n}(\hat{\theta} - \theta),$
where $R(\theta) = \partial r(\theta)/\partial \theta'$ and $\theta^*$ lies between $\hat{\theta}$ and $\theta$. Now $\hat{\theta} \xrightarrow{p} \theta$, so $\theta^* \xrightarrow{p} \theta$ and $R(\theta^*) \xrightarrow{p} R(\theta)$. Thus, we have
    $\sqrt{n}\,[r(\hat{\theta}) - r(\theta)] \xrightarrow{d} N(0, R(\theta) \Psi R'(\theta)).$    (4.23)
Thus, under $H_0: r(\theta) = 0$, assuming $R(\theta) \Psi R'(\theta)$ is nonsingular, we have
    $n\, r(\hat{\theta})' [R(\theta) \Psi R'(\theta)]^{-1} r(\hat{\theta}) \xrightarrow{d} \chi^2_q,$    (4.24)
where $q$ is the length of $r(\cdot)$. In practice, we substitute the consistent estimates $R(\hat{\theta})$ for $R(\theta)$ and $\hat{\Psi}$ for $\Psi$ to obtain, following the arguments given above,
    $n\, r(\hat{\theta})' [R(\hat{\theta}) \hat{\Psi} R'(\hat{\theta})]^{-1} r(\hat{\theta}) \xrightarrow{d} \chi^2_q.$    (4.25)
The behavior under the alternative hypothesis will be $O_p(n)$, as above.
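The statistic (4.25) can be illustrated for a hypothetical nonlinear restriction $r(\theta) = \theta_1 \theta_2 - 1 = 0$ on a pair of population means, estimated by the sample means. Everything in this sketch (the restriction, the population, the sample sizes) is an illustrative choice, not from the text.

```python
import numpy as np

rng = np.random.default_rng(4)

def wald_restriction(y):
    """Statistic (4.25) for the hypothetical restriction theta_1 * theta_2 = 1."""
    n = y.shape[0]
    th = y.mean(axis=0)                       # theta_hat: sample means
    Psi_hat = np.cov(y, rowvar=False)         # consistent estimate of Psi
    r = np.array([th[0] * th[1] - 1.0])       # r(theta_hat), q = 1
    R = np.array([[th[1], th[0]]])            # R(theta_hat) = dr/dtheta'
    V = R @ Psi_hat @ R.T                     # R Psi R'
    return float(n * r @ np.linalg.inv(V) @ r)

# Under H0: means (2, 0.5) satisfy theta_1 * theta_2 = 1, so ~ chi^2_1.
n, reps = 1000, 5000
mu = np.array([2.0, 0.5])
stats = np.array([wald_restriction(rng.normal(mu, 1.0, size=(n, 2))) for _ in range(reps)])
rej_rate = np.mean(stats > 3.841)             # 5% critical value of chi^2_1
mean_stat = stats.mean()                      # chi^2_1 has mean 1
```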
Chapter 5

Maximum Likelihood Methods

5.1 Maximum Likelihood Estimation (MLE)

5.1.1 Motivation

Suppose we have a model for the random variable $y_i$, for $i = 1, 2, \ldots, n$, with unknown $(p \times 1)$ parameter vector $\theta$. In many cases, the model will imply a distribution $f(y_i, \theta)$ for each realization of the variable $y_i$.

A basic premise of statistical inference is to avoid unlikely or rare models, for example in hypothesis testing. If we have a realization of a statistic that exceeds the critical value, then it is a rare event under the null hypothesis. Under the alternative hypothesis, however, such a realization is much more likely to occur, and we reject the null in favor of the alternative. Thus, in choosing between the null and alternative, we select the model that makes the realization of the statistic more likely to have occurred.

Carrying this idea over to estimation, we select values of $\theta$ such that the corresponding values of $f(y_i, \theta)$ are not unlikely. After all, we do not want a model that disagrees strongly with the data. Maximum likelihood estimation is merely a formalization of this notion that the model chosen should not be unlikely. Specifically, we choose the values of the parameters that make the realized data most likely to have occurred. This approach does, however, require that the model be specified in enough detail to imply a distribution for the variable of interest.
5.1.2 The Likelihood Function

Suppose that the random variables $y_1, y_2, \ldots, y_n$ are i.i.d. Then the joint density function for $n$ realizations is
    $f(y_1, y_2, \ldots, y_n \mid \theta) = f(y_1 \mid \theta) \cdot f(y_2 \mid \theta) \cdots f(y_n \mid \theta) = \prod_{i=1}^{n} f(y_i \mid \theta).$    (5.1)
Given values of the parameter vector $\theta$, this function allows us to assign local probability measures for various choices of the random variables $y_1, y_2, \ldots, y_n$. This is the function which must be integrated to make probability statements concerning the joint outcomes of $y_1, y_2, \ldots, y_n$.

Given a set of realized values of the random variables, we use this same function to establish the probability measure associated with various choices of the parameter vector $\theta$.

Definition 5.1 The likelihood function of the parameters, for a particular sample $y_1, y_2, \ldots, y_n$, is the joint density function considered as a function of $\theta$ given the $y_i$'s. That is,
    $L(\theta \mid y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} f(y_i \mid \theta).$    (5.2)

5.1.3 Maximum Likelihood Estimation

For a particular choice of the parameter vector $\theta$, the likelihood function gives a probability measure for the realizations that occurred. Consistent with the approach used in hypothesis testing, and using this function as the metric, we choose the $\theta$ that makes the realizations most likely to have occurred.

Definition 5.2 The maximum likelihood estimator of $\theta$ is the estimator obtained by maximizing $L(\theta \mid y_1, y_2, \ldots, y_n)$ with respect to $\theta$. That is, $\hat{\theta}$ solves
    $\max_{\theta} L(\theta \mid y_1, y_2, \ldots, y_n),$    (5.3)
where $\hat{\theta}$ is called the MLE of $\theta$.

Equivalently, since $\log(\cdot)$ is a strictly monotonic transformation, we may find the MLE of $\theta$ by solving
    $\max_{\theta} \mathcal{L}(\theta \mid y_1, y_2, \ldots, y_n),$    (5.4)
where $\mathcal{L}(\theta \mid y_1, y_2, \ldots, y_n) = \log L(\theta \mid y_1, y_2, \ldots, y_n)$
is denoted the log-likelihood function. In practice, we obtain $\hat{\theta}$ by solving the first-order conditions (FOC)
    $\frac{\partial \mathcal{L}(\hat{\theta}; y_1, y_2, \ldots, y_n)}{\partial \theta} = \sum_{i=1}^{n} \frac{\partial \log f(y_i; \hat{\theta})}{\partial \theta} = 0.$
The motivation for using the log-likelihood function is apparent, since the summation form will result, after division by $n$, in estimators that are approximately averages, about which we know a lot. This advantage is particularly clear in the following example.

Example 5.1 Suppose that $y_i$ i.i.d. $N(\mu, \sigma^2)$, for $i = 1, 2, \ldots, n$. Then
    $f(y_i \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(y_i - \mu)^2},$ for $i = 1, 2, \ldots, n$.
Using the likelihood function (5.2), we have
    $L(\mu, \sigma^2 \mid y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(y_i - \mu)^2}.$    (5.5)
Next, we take the logarithm of (5.5), which gives us
    $\log L(\mu, \sigma^2 \mid y_1, y_2, \ldots, y_n) = \sum_{i=1}^{n} \left[ -\frac{1}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(y_i - \mu)^2 \right] = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \mu)^2.$    (5.6)
We then maximize (5.6) with respect to both $\mu$ and $\sigma^2$. That is, we solve the following first-order conditions:
    (A) $\frac{\partial \log L(\cdot)}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \mu) = 0;$
    (B) $\frac{\partial \log L(\cdot)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(y_i - \mu)^2 = 0.$
By solving (A), we find that $\mu = \frac{1}{n}\sum_{i=1}^{n} y_i = \bar{y}_n$. Solving (B) gives us $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y}_n)^2$. Therefore, $\hat{\mu} = \bar{y}_n$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{\mu})^2$.

Note that $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{\mu})^2 \neq s^2$, where $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \hat{\mu})^2$. Here $s^2$ is the familiar unbiased estimator of $\sigma^2$, and $\hat{\sigma}^2$ is a biased estimator.
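The first-order conditions of Example 5.1 can be verified numerically. In this Python sketch (simulated data with illustrative parameter values), both scores vanish at the closed-form MLEs, perturbing either parameter lowers the log-likelihood, and the MLE of the variance sits below the unbiased estimator $s^2$.

```python
import numpy as np

rng = np.random.default_rng(5)
y = rng.normal(3.0, 1.5, size=2000)
n = y.size

# Closed-form MLEs from the first-order conditions (A) and (B):
mu_hat = y.mean()
s2_hat = ((y - mu_hat) ** 2).mean()            # divides by n, hence biased downward
s2_unbiased = ((y - mu_hat) ** 2).sum() / (n - 1)

def loglik(mu, s2):
    return -0.5 * n * np.log(2 * np.pi * s2) - ((y - mu) ** 2).sum() / (2 * s2)

# The FOCs hold at the MLE: both scores are (numerically) zero.
score_mu = (y - mu_hat).sum() / s2_hat
score_s2 = -n / (2 * s2_hat) + ((y - mu_hat) ** 2).sum() / (2 * s2_hat ** 2)

# And the MLE beats nearby parameter values in likelihood.
best = loglik(mu_hat, s2_hat)
```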
5.2 Asymptotic Behavior of MLEs

5.2.1 Assumptions

For the results we will derive in the following sections, we need to make five assumptions:

1. The $y_i$'s are i.i.d. random variables with density function $f(y_i, \theta)$ for $i = 1, 2, \ldots, n$;

2. $\log f(y_i, \theta)$, and hence $f(y_i, \theta)$, possess derivatives with respect to $\theta$ up to the third order for $\theta \in \Theta$;

3. The range of $y_i$ is independent of $\theta$, hence differentiation under the integral is possible;

4. The parameter vector $\theta$ is globally identified by the density function;

5. $\partial^3 \log f(y_i, \theta)/\partial\theta_i\,\partial\theta_j\,\partial\theta_k$ is bounded in absolute value by some function $H_{ijk}(y)$ for all $y$ and $\theta \in \Theta$, which, in turn, has a finite expectation for all $\theta \in \Theta$.

The first assumption is fundamental and the basis of the estimator. If it is not satisfied, then we are misspecifying the model and there is little hope of obtaining correct inferences, at least in finite samples. The second assumption is a regularity condition that is usually satisfied and easily verified. The third assumption is also easily verified and guaranteed to be satisfied in models where the dependent variable has smooth and infinite support. The fourth assumption must be verified, which is easier in some cases than others. The last assumption is crucial and bears a cost; it really should be verified before MLE is undertaken, but is usually ignored.

5.2.2 Some Preliminaries

Now, we know that
    $\int L(\theta^0 \mid y)\, dy = \int f(y \mid \theta^0)\, dy = 1$    (5.7)
for any value of the true parameter vector $\theta^0$. Therefore,
    $0 = \frac{\partial \int L(\theta^0 \mid y)\, dy}{\partial \theta} = \int \frac{\partial L(\theta^0 \mid y)}{\partial \theta}\, dy = \int \frac{\partial f(y \mid \theta^0)}{\partial \theta}\, dy = \int \frac{\partial \log f(y \mid \theta^0)}{\partial \theta} f(y \mid \theta^0)\, dy,$    (5.8)
and
    $0 = \mathrm{E}\left[\frac{\partial \log f(y \mid \theta^0)}{\partial \theta}\right] = \mathrm{E}\left[\frac{\partial \log L(\theta^0 \mid y)}{\partial \theta}\right]$    (5.9)
for any value of the true parameter vector $\theta^0$.

Differentiating (5.8) again yields
    $0 = \int \left[\frac{\partial^2 \log f(y \mid \theta^0)}{\partial \theta\, \partial \theta'} f(y \mid \theta^0) + \frac{\partial \log f(y \mid \theta^0)}{\partial \theta} \frac{\partial f(y \mid \theta^0)}{\partial \theta'}\right] dy.$    (5.10)
Since
    $\frac{\partial f(y \mid \theta^0)}{\partial \theta'} = \frac{\partial \log f(y \mid \theta^0)}{\partial \theta'}\, f(y \mid \theta^0),$    (5.11)
we can rewrite (5.10) as
    $0 = \mathrm{E}\left[\frac{\partial^2 \log f(y \mid \theta^0)}{\partial \theta\, \partial \theta'}\right] + \mathrm{E}\left[\frac{\partial \log f(y \mid \theta^0)}{\partial \theta} \frac{\partial \log f(y \mid \theta^0)}{\partial \theta'}\right],$    (5.12)
or, in terms of the likelihood function,
    $\vartheta(\theta^0) = \mathrm{E}\left[\frac{\partial \log L(\theta^0 \mid y)}{\partial \theta} \frac{\partial \log L(\theta^0 \mid y)}{\partial \theta'}\right] = -\mathrm{E}\left[\frac{\partial^2 \log L(\theta^0 \mid y)}{\partial \theta\, \partial \theta'}\right].$    (5.13)
The matrix $\vartheta(\theta^0)$ is called the information matrix, and the relationship given in (5.13) the information matrix equality.

Finally, we note that
    $\mathrm{E}\left[\frac{\partial \log L(\theta^0 \mid y)}{\partial \theta} \frac{\partial \log L(\theta^0 \mid y)}{\partial \theta'}\right] = \mathrm{E}\left[\sum_{i=1}^{n} \frac{\partial \log f_i(y \mid \theta^0)}{\partial \theta} \sum_{i=1}^{n} \frac{\partial \log f_i(y \mid \theta^0)}{\partial \theta'}\right] = \sum_{i=1}^{n} \mathrm{E}\left[\frac{\partial \log f_i(y \mid \theta^0)}{\partial \theta} \frac{\partial \log f_i(y \mid \theta^0)}{\partial \theta'}\right],$    (5.14)
since the covariances between different observations are zero.

5.2.3 Asymptotic Properties

Consistent Root Exists

Consider the case where $p = 1$. Then, expanding in a Taylor series and using the intermediate value theorem on the quadratic term yields
    $\frac{1}{n}\frac{\partial \mathcal{L}(\hat{\theta} \mid y)}{\partial \theta} = \frac{1}{n}\sum_{i=1}^{n}\frac{\partial \log f(y_i \mid \theta^0)}{\partial \theta} + \frac{1}{n}\sum_{i=1}^{n}\frac{\partial^2 \log f(y_i \mid \theta^0)}{\partial \theta^2}(\hat{\theta} - \theta^0) + \frac{1}{2n}\sum_{i=1}^{n}\frac{\partial^3 \log f(y_i \mid \theta^*)}{\partial \theta^3}(\hat{\theta} - \theta^0)^2,$    (5.15)
where $\theta^*$ lies between $\hat{\theta}$ and $\theta^0$. Now, by assumption 5, we have
    $\left|\frac{1}{2n}\sum_{i=1}^{n}\frac{\partial^3 \log f(y_i \mid \theta^*)}{\partial \theta^3}\right| \leq \frac{k}{n}\sum_{i=1}^{n} H(y_i)$    (5.16)
for some $k < \infty$. So,
    $\frac{1}{n}\frac{\partial \mathcal{L}(\hat{\theta} \mid y)}{\partial \theta} = a\delta^2 + b\delta + c,$    (5.17)
where
    $\delta = \hat{\theta} - \theta^0,$
    $a = k\,\frac{1}{n}\sum_{i=1}^{n} H(y_i),$
    $b = \frac{1}{n}\sum_{i=1}^{n}\frac{\partial^2 \log f(y_i \mid \theta^0)}{\partial \theta^2},$ and
    $c = \frac{1}{n}\sum_{i=1}^{n}\frac{\partial \log f(y_i \mid \theta^0)}{\partial \theta}.$
Note that $a = k\,\frac{1}{n}\sum_{i=1}^{n} H(y_i) = k\,\mathrm{E}[H(y_i)] + o_p(1) = O_p(1)$, $\operatorname{plim}_{n\to\infty} c = 0$, and $\operatorname{plim}_{n\to\infty} b = -\vartheta(\theta^0)$.

Now, since $\partial \mathcal{L}(\hat{\theta} \mid y)/\partial \theta = 0$, we have $a\delta^2 + b\delta + c = 0$. There are two possibilities. If $a \neq 0$ with probability 1, which will occur when the FOC are
nonlinear in $\delta$, then
    $\delta = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$    (5.18)
Since $ac = o_p(1)$, then $\delta \xrightarrow{p} 0$ for the plus root, while $\delta \xrightarrow{p} \vartheta(\theta^0)/\alpha$ for the negative root, if $\operatorname{plim}_{n\to\infty} a = \alpha \neq 0$ exists. If $a = 0$, then the FOC are linear in $\delta$, whereupon $\delta = -c/b$ and again $\delta \xrightarrow{p} 0$. If the FOC are nonlinear but asymptotically linear, then $a \xrightarrow{p} 0$, and $ac$ in the numerator of (5.18) will still go to zero faster than $a$ in the denominator, so $\delta \xrightarrow{p} 0$. Thus there exists at least one consistent solution $\hat{\theta}$ which satisfies
    $\operatorname{plim}_{n\to\infty}(\hat{\theta} - \theta^0) = 0,$    (5.19)
and in the asymptotically nonlinear case there is also a possibly inconsistent solution.

For the case of $\theta$ a vector, we can apply a similar style of proof to show that there exists a solution $\hat{\theta}$ to the FOC that satisfies $\operatorname{plim}_{n\to\infty}(\hat{\theta} - \theta^0) = 0$. And in the event of asymptotically nonlinear FOC there is at least one other possibly inconsistent root.

Global Maximum Is Consistent

In the event of multiple roots, we are left with the problem of selecting between them. By assumption 4, the parameter $\theta$ is globally identified by the density function. Formally, this means that
    $f(y, \theta^1) = f(y, \theta^0)$    (5.20)
for all $y$ implies that $\theta^1 = \theta^0$. Now,
    $\mathrm{E}\left[\frac{f(y, \theta^1)}{f(y, \theta^0)}\right] = \int \frac{f(y, \theta^1)}{f(y, \theta^0)} f(y, \theta^0)\, dy = 1.$    (5.21)
Thus, by Jensen's Inequality, we have
    $\mathrm{E}\left[\log \frac{f(y, \theta^1)}{f(y, \theta^0)}\right] < \log \mathrm{E}\left[\frac{f(y, \theta^1)}{f(y, \theta^0)}\right] = 0$    (5.22)
unless $f(y, \theta^1) = f(y, \theta^0)$ for all $y$, that is, unless $\theta^1 = \theta^0$. Therefore, $\mathrm{E}[\log f(y, \theta)]$ achieves a maximum if and only if $\theta = \theta^0$. However, we are solving
    $\max_{\theta} \frac{1}{n}\sum_{i=1}^{n} \log f(y_i, \theta), \quad \text{where } \frac{1}{n}\sum_{i=1}^{n} \log f(y_i, \theta) \xrightarrow{p} \mathrm{E}[\log f(y_i, \theta)],$    (5.23)
and if we choose an inconsistent root, we will not obtain a global maximum. Thus, asymptotically, the global maximum is a consistent root.

This choice of the global root has added appeal since it is in fact the MLE among the possible alternatives, and hence the choice that makes the realized data most likely to have occurred. There are complications in finite samples, since the values of the likelihood function at alternative roots may cross over as the sample size increases. That is, the global maximum in small samples may not be the global maximum in larger samples. An added problem is to identify all the alternative roots so we can choose the global maximum. Sometimes a solution is available in a simple consistent estimator, which may be used to start the nonlinear MLE optimization.

Asymptotic Normality

For $p = 1$, we have $a\delta^2 + b\delta + c = 0$, so
    $\delta = \hat{\theta} - \theta^0 = \frac{-c}{a\delta + b}$
and
    $\sqrt{n}(\hat{\theta} - \theta^0) = \frac{-\sqrt{n}\,c}{a(\hat{\theta} - \theta^0) + b}.$    (5.24)
Now, since $a = O_p(1)$ and $\hat{\theta} - \theta^0 = o_p(1)$, then $a(\hat{\theta} - \theta^0) = o_p(1)$ and
    $a(\hat{\theta} - \theta^0) + b \xrightarrow{p} -\vartheta(\theta^0).$    (5.25)
And by the CLT we have
    $\sqrt{n}\,c = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{\partial \log f(y_i \mid \theta^0)}{\partial \theta} \xrightarrow{d} N(0, \vartheta(\theta^0)).$    (5.26)
Substituting these two results in (5.24), we find
    $\sqrt{n}(\hat{\theta} - \theta^0) \xrightarrow{d} N(0, \vartheta(\theta^0)^{-1}).$
In general, for $p > 1$, we can apply the same scalar proof to show $\sqrt{n}(\lambda'\hat{\theta} - \lambda'\theta^0) \xrightarrow{d} N(0, \lambda'\vartheta(\theta^0)^{-1}\lambda)$ for any vector $\lambda$, which means
    $\sqrt{n}(\hat{\theta} - \theta^0) \xrightarrow{d} N(0, \vartheta^{-1}(\theta^0))$    (5.27)
if $\hat{\theta}$ is the global maximum.
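The limit (5.27) can be visualized for a one-parameter example. For an exponential density $f(y \mid \theta) = \theta e^{-\theta y}$ the score is $1/\theta - y$, the information is $\vartheta(\theta) = 1/\theta^2$, and the MLE is $\hat{\theta} = 1/\bar{y}$, so $\sqrt{n}(\hat{\theta} - \theta^0)$ should be approximately $N(0, (\theta^0)^2)$. A minimal simulation sketch (parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(10)

theta0, n, reps = 2.0, 500, 20_000

# MLE of the exponential rate from each simulated sample of size n
ybar = rng.exponential(1 / theta0, size=(reps, n)).mean(axis=1)
theta_hat = 1.0 / ybar
z = np.sqrt(n) * (theta_hat - theta0)

asym_var = z.var()                                  # should approach theta0^2 = 4
frac_within = np.mean(np.abs(z / theta0) < 1.96)    # roughly 0.95 after standardizing
```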
Cramér-Rao Lower Bound

In addition to being the covariance matrix of the MLE, $\vartheta^{-1}(\theta^0)$ defines a lower bound for covariance matrices with certain desirable properties. Let $\tilde{\theta}(y)$ be any unbiased estimator; then
    $\mathrm{E}[\tilde{\theta}(y)] = \int \tilde{\theta}(y)\, f(y; \theta)\, dy = \theta$    (5.28)
for any underlying $\theta = \theta^0$. Differentiating both sides of this relationship with respect to $\theta$ yields
    $I_p = \frac{\partial \mathrm{E}[\tilde{\theta}(y)]}{\partial \theta'} = \int \tilde{\theta}(y)\frac{\partial f(y; \theta^0)}{\partial \theta'}\, dy = \int \tilde{\theta}(y)\frac{\partial \log f(y; \theta^0)}{\partial \theta'} f(y; \theta^0)\, dy = \mathrm{E}\left[\tilde{\theta}(y)\frac{\partial \log f(y; \theta^0)}{\partial \theta'}\right] = \mathrm{E}\left[(\tilde{\theta}(y) - \theta^0)\frac{\partial \log f(y; \theta^0)}{\partial \theta'}\right].$    (5.29)
Next, we let
    $C(\theta^0) = \mathrm{E}\left[(\tilde{\theta}(y) - \theta^0)(\tilde{\theta}(y) - \theta^0)'\right]$    (5.30)
be the covariance matrix of $\tilde{\theta}(y)$. Then
    $\operatorname{Cov}\begin{pmatrix} \tilde{\theta}(y) \\ \frac{\partial \log L}{\partial \theta} \end{pmatrix} = \begin{pmatrix} C(\theta^0) & I_p \\ I_p & \vartheta(\theta^0) \end{pmatrix},$    (5.31)
where $I_p$ is a $p \times p$ identity matrix, and (5.31), as a covariance matrix, is positive semidefinite. Now, for any $(p \times 1)$ vector $a$, we have
    $\begin{pmatrix} a' & -a'\vartheta(\theta^0)^{-1} \end{pmatrix} \begin{pmatrix} C(\theta^0) & I_p \\ I_p & \vartheta(\theta^0) \end{pmatrix} \begin{pmatrix} a \\ -\vartheta(\theta^0)^{-1}a \end{pmatrix} = a'[C(\theta^0) - \vartheta(\theta^0)^{-1}]a \geq 0.$    (5.32)
Thus, any unbiased estimator $\tilde{\theta}(y)$ has a covariance matrix that exceeds $\vartheta(\theta^0)^{-1}$ by a positive semidefinite matrix. And if the MLE estimator is unbiased, it is efficient within the class of unbiased estimators. Likewise, any CUAN estimator will have an asymptotic covariance exceeding $\vartheta(\theta^0)^{-1}$. Since the asymptotic covariance of the MLE is, in fact, $\vartheta(\theta^0)^{-1}$, it is efficient (asymptotically). $\vartheta(\theta^0)^{-1}$ is called the Cramér-Rao lower bound.
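The bound can be seen concretely in the normal location model: the information for $\mu$ per observation is $1/\sigma^2$, so no unbiased estimator can have variance below $\sigma^2/n$. This simulation sketch compares the sample mean (the MLE of $\mu$) with the sample median, another unbiased estimator of $\mu$ under symmetry; all numbers are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(7)

mu0, sigma2 = 1.0, 4.0
n, reps = 50, 200_000

y = rng.normal(mu0, np.sqrt(sigma2), size=(reps, n))

var_mean = y.mean(axis=1).var()            # Monte Carlo variance of the sample mean
var_median = np.var(np.median(y, axis=1))  # variance of a competing unbiased estimator

crlb = sigma2 / n                          # inverse information: sigma^2 / n = 0.08
```

The sample mean attains the bound, while the median's variance is strictly larger (about a factor of pi/2 in large samples).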
5.3 Maximum Likelihood Inference

5.3.1 Likelihood Ratio Test

Suppose we wish to test $H_0: \theta = \theta^0$ against $H_1: \theta \neq \theta^0$. Then we define
    $L_u = \max_{\theta} L(\theta \mid y) = L(\hat{\theta} \mid y)$    (5.33)
and
    $L_r = L(\theta^0 \mid y),$    (5.34)
where $L_u$ is the unrestricted likelihood and $L_r$ is the restricted likelihood. We then form the likelihood ratio
    $\lambda = \frac{L_r}{L_u}.$    (5.35)
Note that the restricted likelihood can be no larger than the unrestricted, which maximizes the function.

As with estimation, it is more convenient to work with the logs of the likelihood functions. It will be shown below that, under $H_0$,
    $LR = -2\log\lambda = -2\log\frac{L_r}{L_u} = 2[\mathcal{L}(\hat{\theta} \mid y) - \mathcal{L}(\theta^0 \mid y)] \xrightarrow{d} \chi^2_p,$    (5.36)
where $\hat{\theta}$ is the unrestricted MLE and $\theta^0$ is the restricted MLE. If $H_1$ applies, then $LR = O_p(n)$. Large values of this statistic indicate that the restrictions make the observed values much less likely than the unrestricted estimates, so we prefer the unrestricted model and reject the restrictions.

In general, for $H_0: r(\theta) = 0$ and $H_1: r(\theta) \neq 0$, we have
    $L_u = \max_{\theta} L(\theta \mid y) = L(\hat{\theta} \mid y)$    (5.37)
and
    $L_r = \max_{\theta} L(\theta \mid y) \ \text{s.t.}\ r(\theta) = 0 = L(\tilde{\theta} \mid y).$    (5.38)
Under $H_0$,
    $LR = 2[\mathcal{L}(\hat{\theta} \mid y) - \mathcal{L}(\tilde{\theta} \mid y)] \xrightarrow{d} \chi^2_q,$    (5.39)
where $q$ is the length of $r(\cdot)$. Note that in the general case, the likelihood ratio test requires calculation of both the restricted and the unrestricted MLE.
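The statistic (5.36) can be computed directly in the normal model with known unit variance, where the unrestricted MLE of the mean is $\bar{y}$. This sketch (simulation parameters are illustrative) confirms the null distribution is $\chi^2_1$ and shows the $O_p(n)$ growth under an alternative.

```python
import numpy as np

rng = np.random.default_rng(8)

def loglik(y, mu):
    # log-likelihood of N(mu, 1) with known unit variance
    return -0.5 * y.size * np.log(2 * np.pi) - 0.5 * ((y - mu) ** 2).sum()

mu0, n, reps = 0.0, 100, 20_000
lr_null = np.empty(reps)
for r in range(reps):
    y = rng.normal(mu0, 1.0, size=n)
    lr_null[r] = 2 * (loglik(y, y.mean()) - loglik(y, mu0))   # eq. (5.36)

rej_rate = np.mean(lr_null > 3.841)     # 5% critical value of chi^2_1

# Under an alternative mean of 0.3 the statistic is O_p(n): here LR = n * ybar^2.
y_alt = rng.normal(0.3, 1.0, size=1000)
lr_alt = 2 * (loglik(y_alt, y_alt.mean()) - loglik(y_alt, mu0))
```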
5.3.2 Wald Test

The asymptotic normality of the MLE may be used to obtain a test based only on the unrestricted estimates. Now, under $H_0: \theta = \theta^0$, we have
    $\sqrt{n}(\hat{\theta} - \theta^0) \xrightarrow{d} N(0, \vartheta^{-1}(\theta^0)).$    (5.40)
Thus, using the results on the asymptotic behavior of quadratic forms from the previous chapter, we have
    $W = n(\hat{\theta} - \theta^0)'\,\vartheta(\theta^0)\,(\hat{\theta} - \theta^0) \xrightarrow{d} \chi^2_p,$    (5.41)
which is the Wald test. As we discussed for quadratic tests in general, under $H_1: \theta \neq \theta^0$ we would have $W = O_p(n)$. In practice, since
    $-\frac{1}{n}\frac{\partial^2 \mathcal{L}(\hat{\theta} \mid y)}{\partial \theta\, \partial \theta'} = -\frac{1}{n}\sum_{i=1}^{n}\frac{\partial^2 \log f(y_i \mid \hat{\theta})}{\partial \theta\, \partial \theta'} \xrightarrow{p} \vartheta(\theta^0),$    (5.42)
we use
    $W = -(\hat{\theta} - \theta^0)'\,\frac{\partial^2 \mathcal{L}(\hat{\theta} \mid y)}{\partial \theta\, \partial \theta'}\,(\hat{\theta} - \theta^0) \xrightarrow{d} \chi^2_p.$    (5.43)

Aside from having the same asymptotic distribution, the Likelihood Ratio and Wald tests are asymptotically equivalent in the sense that
    $\operatorname{plim}_{n\to\infty}(LR - W) = 0.$    (5.44)
This is shown by expanding $\mathcal{L}(\theta^0 \mid y)$ in a Taylor series about $\hat{\theta}$. That is,
    $\mathcal{L}(\theta^0) = \mathcal{L}(\hat{\theta}) + \frac{\partial \mathcal{L}(\hat{\theta})}{\partial \theta'}(\theta^0 - \hat{\theta}) + \frac{1}{2}(\theta^0 - \hat{\theta})'\frac{\partial^2 \mathcal{L}(\hat{\theta})}{\partial \theta\, \partial \theta'}(\theta^0 - \hat{\theta}) + \frac{1}{6}\sum_i \sum_j \sum_k \frac{\partial^3 \mathcal{L}(\theta^*)}{\partial \theta_i\, \partial \theta_j\, \partial \theta_k}(\theta_i^0 - \hat{\theta}_i)(\theta_j^0 - \hat{\theta}_j)(\theta_k^0 - \hat{\theta}_k),$    (5.45)
where the third-order term applies the intermediate value theorem for $\theta^*$ between $\hat{\theta}$ and $\theta^0$. Now $\partial \mathcal{L}(\hat{\theta})/\partial \theta = 0$, and the third-order term can be shown to be $O_p(1/\sqrt{n})$ under assumption 5, whereupon we have
    $\mathcal{L}(\hat{\theta}) - \mathcal{L}(\theta^0) = -\frac{1}{2}(\theta^0 - \hat{\theta})'\frac{\partial^2 \mathcal{L}(\hat{\theta})}{\partial \theta\, \partial \theta'}(\theta^0 - \hat{\theta}) + O_p(1/\sqrt{n}).$    (5.46)
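The equivalence (5.44) is most instructive in a model where $LR$ and $W$ are not algebraically identical. This sketch uses an exponential rate model (an illustrative choice): both statistics are built from the MLE $\hat{\theta} = 1/\bar{y}$, with the information estimated by $1/\hat{\theta}^2$ as in (5.42). Both tests reject at close to the nominal 5% rate under the null, and the two statistics differ only by a term that vanishes as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(9)

theta0, n, reps = 2.0, 500, 10_000

def lr_and_wald(y):
    """LR and Wald statistics for H0: theta = theta0 in f(y|theta) = theta exp(-theta y)."""
    th = 1.0 / y.mean()                            # unrestricted MLE
    def loglik(t):
        return y.size * np.log(t) - t * y.sum()
    lr = 2 * (loglik(th) - loglik(theta0))
    w = y.size * (th - theta0) ** 2 / th ** 2      # information estimated by 1/theta_hat^2
    return lr, w

draws = np.array([lr_and_wald(rng.exponential(1 / theta0, size=n)) for _ in range(reps)])
lr_stats, w_stats = draws[:, 0], draws[:, 1]

lr_rej = np.mean(lr_stats > 3.841)                 # both near the nominal 5%
w_rej = np.mean(w_stats > 3.841)
mean_gap = np.mean(np.abs(lr_stats - w_stats))     # small, shrinking with n, per (5.44)
```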
More information3.7 Implicit Differentiation -- A Brief Introduction -- Student Notes
Fin these erivatives of these functions: y.7 Implicit Differentiation -- A Brief Introuction -- Stuent Notes tan y sin tan = sin y e = e = Write the inverses of these functions: y tan y sin How woul we
More information1. Aufgabenblatt zur Vorlesung Probability Theory
24.10.17 1. Aufgabenblatt zur Vorlesung By (Ω, A, P ) we always enote the unerlying probability space, unless state otherwise. 1. Let r > 0, an efine f(x) = 1 [0, [ (x) exp( r x), x R. a) Show that p f
More informationMath 1271 Solutions for Fall 2005 Final Exam
Math 7 Solutions for Fall 5 Final Eam ) Since the equation + y = e y cannot be rearrange algebraically in orer to write y as an eplicit function of, we must instea ifferentiate this relation implicitly
More informationOn conditional moments of high-dimensional random vectors given lower-dimensional projections
Submitte to the Bernoulli arxiv:1405.2183v2 [math.st] 6 Sep 2016 On conitional moments of high-imensional ranom vectors given lower-imensional projections LUKAS STEINBERGER an HANNES LEEB Department of
More informationEconometrics I. September, Part I. Department of Economics Stanford University
Econometrics I Deartment of Economics Stanfor University Setember, 2008 Part I Samling an Data Poulation an Samle. ineenent an ientical samling. (i.i..) Samling with relacement. aroximates samling without
More informationA Review of Multiple Try MCMC algorithms for Signal Processing
A Review of Multiple Try MCMC algorithms for Signal Processing Luca Martino Image Processing Lab., Universitat e València (Spain) Universia Carlos III e Mari, Leganes (Spain) Abstract Many applications
More informationTHE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE
Journal of Soun an Vibration (1996) 191(3), 397 414 THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE E. M. WEINSTEIN Galaxy Scientific Corporation, 2500 English Creek
More informationSome Examples. Uniform motion. Poisson processes on the real line
Some Examples Our immeiate goal is to see some examples of Lévy processes, an/or infinitely-ivisible laws on. Uniform motion Choose an fix a nonranom an efine X := for all (1) Then, {X } is a [nonranom]
More informationLecture 5. Symmetric Shearer s Lemma
Stanfor University Spring 208 Math 233: Non-constructive methos in combinatorics Instructor: Jan Vonrák Lecture ate: January 23, 208 Original scribe: Erik Bates Lecture 5 Symmetric Shearer s Lemma Here
More informationA Weak First Digit Law for a Class of Sequences
International Mathematical Forum, Vol. 11, 2016, no. 15, 67-702 HIKARI Lt, www.m-hikari.com http://x.oi.org/10.1288/imf.2016.6562 A Weak First Digit Law for a Class of Sequences M. A. Nyblom School of
More informationMath Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors
Math 18.02 Notes on ifferentials, the Chain Rule, graients, irectional erivative, an normal vectors Tangent plane an linear approximation We efine the partial erivatives of f( xy, ) as follows: f f( x+
More informationThe Exact Form and General Integrating Factors
7 The Exact Form an General Integrating Factors In the previous chapters, we ve seen how separable an linear ifferential equations can be solve using methos for converting them to forms that can be easily
More informationImplicit Differentiation
Implicit Differentiation Thus far, the functions we have been concerne with have been efine explicitly. A function is efine explicitly if the output is given irectly in terms of the input. For instance,
More informationApplications of the Wronskian to ordinary linear differential equations
Physics 116C Fall 2011 Applications of the Wronskian to orinary linear ifferential equations Consier a of n continuous functions y i (x) [i = 1,2,3,...,n], each of which is ifferentiable at least n times.
More informationELEC3114 Control Systems 1
ELEC34 Control Systems Linear Systems - Moelling - Some Issues Session 2, 2007 Introuction Linear systems may be represente in a number of ifferent ways. Figure shows the relationship between various representations.
More informationFLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction
FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS ALINA BUCUR, CHANTAL DAVID, BROOKE FEIGON, MATILDE LALÍN 1 Introuction In this note, we stuy the fluctuations in the number
More informationZachary Scherr Math 503 HW 3 Due Friday, Feb 12
Zachary Scherr Math 503 HW 3 Due Friay, Feb 1 1 Reaing 1. Rea sections 7.5, 7.6, 8.1 of Dummit an Foote Problems 1. DF 7.5. Solution: This problem is trivial knowing how to work with universal properties.
More informationSpurious Significance of Treatment Effects in Overfitted Fixed Effect Models Albrecht Ritschl 1 LSE and CEPR. March 2009
Spurious Significance of reatment Effects in Overfitte Fixe Effect Moels Albrecht Ritschl LSE an CEPR March 2009 Introuction Evaluating subsample means across groups an time perios is common in panel stuies
More informationWitt#5: Around the integrality criterion 9.93 [version 1.1 (21 April 2013), not completed, not proofread]
Witt vectors. Part 1 Michiel Hazewinkel Sienotes by Darij Grinberg Witt#5: Aroun the integrality criterion 9.93 [version 1.1 21 April 2013, not complete, not proofrea In [1, section 9.93, Hazewinkel states
More informationMath 342 Partial Differential Equations «Viktor Grigoryan
Math 342 Partial Differential Equations «Viktor Grigoryan 6 Wave equation: solution In this lecture we will solve the wave equation on the entire real line x R. This correspons to a string of infinite
More informationThe derivative of a function f(x) is another function, defined in terms of a limiting expression: f(x + δx) f(x)
Y. D. Chong (2016) MH2801: Complex Methos for the Sciences 1. Derivatives The erivative of a function f(x) is another function, efine in terms of a limiting expression: f (x) f (x) lim x δx 0 f(x + δx)
More informationOptimization of Geometries by Energy Minimization
Optimization of Geometries by Energy Minimization by Tracy P. Hamilton Department of Chemistry University of Alabama at Birmingham Birmingham, AL 3594-140 hamilton@uab.eu Copyright Tracy P. Hamilton, 1997.
More informationSome functions and their derivatives
Chapter Some functions an their erivatives. Derivative of x n for integer n Recall, from eqn (.6), for y = f (x), Also recall that, for integer n, Hence, if y = x n then y x = lim δx 0 (a + b) n = a n
More informationMath 115 Section 018 Course Note
Course Note 1 General Functions Definition 1.1. A function is a rule that takes certain numbers as inputs an assigns to each a efinite output number. The set of all input numbers is calle the omain of
More informationII. First variation of functionals
II. First variation of functionals The erivative of a function being zero is a necessary conition for the etremum of that function in orinary calculus. Let us now tackle the question of the equivalent
More informationProof of SPNs as Mixture of Trees
A Proof of SPNs as Mixture of Trees Theorem 1. If T is an inuce SPN from a complete an ecomposable SPN S, then T is a tree that is complete an ecomposable. Proof. Argue by contraiction that T is not a
More information0.1 Differentiation Rules
0.1 Differentiation Rules From our previous work we ve seen tat it can be quite a task to calculate te erivative of an arbitrary function. Just working wit a secon-orer polynomial tings get pretty complicate
More informationMath 1B, lecture 8: Integration by parts
Math B, lecture 8: Integration by parts Nathan Pflueger 23 September 2 Introuction Integration by parts, similarly to integration by substitution, reverses a well-known technique of ifferentiation an explores
More informationTwo formulas for the Euler ϕ-function
Two formulas for the Euler ϕ-function Robert Frieman A multiplication formula for ϕ(n) The first formula we want to prove is the following: Theorem 1. If n 1 an n 2 are relatively prime positive integers,
More informationModeling of Dependence Structures in Risk Management and Solvency
Moeling of Depenence Structures in Risk Management an Solvency University of California, Santa Barbara 0. August 007 Doreen Straßburger Structure. Risk Measurement uner Solvency II. Copulas 3. Depenent
More informationCalculus and optimization
Calculus an optimization These notes essentially correspon to mathematical appenix 2 in the text. 1 Functions of a single variable Now that we have e ne functions we turn our attention to calculus. A function
More informationTAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS
MISN-0-4 TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS f(x ± ) = f(x) ± f ' (x) + f '' (x) 2 ±... 1! 2! = 1.000 ± 0.100 + 0.005 ±... TAYLOR S POLYNOMIAL APPROXIMATION FOR FUNCTIONS by Peter Signell 1.
More information4. Distributions of Functions of Random Variables
4. Distributions of Functions of Random Variables Setup: Consider as given the joint distribution of X 1,..., X n (i.e. consider as given f X1,...,X n and F X1,...,X n ) Consider k functions g 1 : R n
More informationRelative Entropy and Score Function: New Information Estimation Relationships through Arbitrary Additive Perturbation
Relative Entropy an Score Function: New Information Estimation Relationships through Arbitrary Aitive Perturbation Dongning Guo Department of Electrical Engineering & Computer Science Northwestern University
More informationLeChatelier Dynamics
LeChatelier Dynamics Robert Gilmore Physics Department, Drexel University, Philaelphia, Pennsylvania 1914, USA (Date: June 12, 28, Levine Birthay Party: To be submitte.) Dynamics of the relaxation of a
More informationConservation Laws. Chapter Conservation of Energy
20 Chapter 3 Conservation Laws In orer to check the physical consistency of the above set of equations governing Maxwell-Lorentz electroynamics [(2.10) an (2.12) or (1.65) an (1.68)], we examine the action
More informationBasic Thermoelasticity
Basic hermoelasticity Biswajit Banerjee November 15, 2006 Contents 1 Governing Equations 1 1.1 Balance Laws.............................................. 2 1.2 he Clausius-Duhem Inequality....................................
More informationA Sketch of Menshikov s Theorem
A Sketch of Menshikov s Theorem Thomas Bao March 14, 2010 Abstract Let Λ be an infinite, locally finite oriente multi-graph with C Λ finite an strongly connecte, an let p
More informationSYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions
SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationCalculus of Variations
Calculus of Variations Lagrangian formalism is the main tool of theoretical classical mechanics. Calculus of Variations is a part of Mathematics which Lagrangian formalism is base on. In this section,
More informationMath 300 Winter 2011 Advanced Boundary Value Problems I. Bessel s Equation and Bessel Functions
Math 3 Winter 2 Avance Bounary Value Problems I Bessel s Equation an Bessel Functions Department of Mathematical an Statistical Sciences University of Alberta Bessel s Equation an Bessel Functions We use
More informationInternational Journal of Pure and Applied Mathematics Volume 35 No , ON PYTHAGOREAN QUADRUPLES Edray Goins 1, Alain Togbé 2
International Journal of Pure an Applie Mathematics Volume 35 No. 3 007, 365-374 ON PYTHAGOREAN QUADRUPLES Eray Goins 1, Alain Togbé 1 Department of Mathematics Purue University 150 North University Street,
More informationThe Delta Method and Applications
Chapter 5 The Delta Method and Applications 5.1 Local linear approximations Suppose that a particular random sequence converges in distribution to a particular constant. The idea of using a first-order
More informationCalculus of Variations
16.323 Lecture 5 Calculus of Variations Calculus of Variations Most books cover this material well, but Kirk Chapter 4 oes a particularly nice job. x(t) x* x*+ αδx (1) x*- αδx (1) αδx (1) αδx (1) t f t
More information