Changing sets of parameters by partial mappings Michael Wiedenbeck and Nanny Wermuth


Center of Survey Research, Mannheim, Germany, and Department of Mathematical Statistics, Chalmers/Göteborgs Universitet, Sweden

Abstract: Changes between different sets of parameters are needed in multivariate statistical modeling. There may, for instance, be specific inference questions based on subject-matter interpretations, alternative well-fitting constrained models, compatibility judgements of seemingly distinct constrained models, and different reference priors under alternative parameterizations. We introduce and discuss a partial mapping, called partial replication, and derive from it a more complex mapping, called partial inversion. Both operations are shown to be tools for decomposing matrix operations, for explaining recursion relations among sets of linear parameters, for deriving effects of omitted variables in alternative model formulations, for changing between different types of linear chain graph models, for approximating maximum-likelihood estimates in exponential family models under independence constraints, and for switching partially between sets of canonical and moment parameters in distributions of the exponential family, or between sets of corresponding maximum-likelihood estimates.

Key words and phrases: Exponential family, graphical Markov models, independence constraints, partial replication, partial inversion, reduced model estimates, REML-estimates, sandwich estimates, structural equations.

1. Definitions and properties of two operators

1.1. Partial inversion

We start with a linear function connecting two real-valued column vectors y and x via a square matrix M of dimension d for which all principal submatrices are invertible,

\[ M y = x. \quad (1.1) \]

The index sets of rows and columns of M coincide and are ordered as V = (1, ..., d). For an arbitrary subset a of V we consider first a split of V as (a, b), with b = V \ a.
A linear operator for equation (1.1), called partial inversion, may be used to invert M in a sequence of steps and forms a starting point for proving properties of graphical

Markov models, see Wermuth and Cox (2004), Wermuth, Wiedenbeck and Cox (2006), and Marchetti and Wermuth (2007). Partial inversion is a minimally changed version of Gram-Schmidt orthogonalisation and of the sweep operator, the changes being such that an operator with attractive features results.

After partial inversion applied to rows and columns a of M, denoted by inv_a M, the argument and image in equation (1.1) relating to a are exchanged:

\[ \mathrm{inv}_a M \begin{pmatrix} x_a \\ y_b \end{pmatrix} = \begin{pmatrix} y_a \\ x_b \end{pmatrix}, \qquad \mathrm{inv}_a M = \begin{pmatrix} M_{aa}^{-1} & -M_{aa}^{-1} M_{ab} \\ M_{ba} M_{aa}^{-1} & M_{bb·a} \end{pmatrix}. \quad (1.2) \]

The matrix M_{bb·a} = M_{bb} - M_{ba} M_{aa}^{-1} M_{ab} is often called the Schur complement of M_{aa} in M, after Issai Schur, and M_{aa}^{-1} denotes the inverse of the submatrix M_{aa} of M. Partial inversion with respect to b applied to inv_a M is denoted in several equivalent ways, depending on the context,

\[ \mathrm{inv}_b(\mathrm{inv}_a M) = \mathrm{inv}_b \mathrm{inv}_a M = \mathrm{inv}_V M = \mathrm{inv}_{ab} M, \]

where inv_V M yields the inverse of M. Some of the basic properties of partial inversion are

\[ \text{(i)} \ \mathrm{inv}_a \mathrm{inv}_a M = M, \qquad \text{(ii)} \ (\mathrm{inv}_a M)^{-1} = \mathrm{inv}_b M, \qquad \text{(iii)} \ \mathrm{inv}_a M = \mathrm{inv}_b(M^{-1}). \quad (1.3) \]

The operator is commutative, can be undone, and gives the same result if applied with respect to an index set within a submatrix of M as if applied to M first, with the submatrix selected afterwards. More precisely, for three disjoint subsets α, β, γ of V, with a = α ∪ β,

\[ \text{(i)} \ \mathrm{inv}_α \mathrm{inv}_β M = \mathrm{inv}_β \mathrm{inv}_α M, \qquad \text{(ii)} \ \mathrm{inv}_{αβ} \mathrm{inv}_{βγ} M = \mathrm{inv}_{αγ} M, \qquad \text{(iii)} \ \mathrm{inv}_α M_{aa} = [\mathrm{inv}_α M]_{a,a}. \quad (1.4) \]

Indeed, all properties given so far result from equations (1.8) below, where partial inversion is expressed in terms of what will be defined as partial replication. Further properties, not used in this paper, may be derived by viewing partial inversion as a group isomorphism. From this point of view, partial inversion operates on the power

set of V, which has group structure with the symmetric difference, denoted by △, as law of composition. The neutral element is the empty set and the inverse of a subset a of V is a itself. The image space of partial inversion is then the set of transformations {inv_a, a ⊆ V}, a group under the composition of transformations with the identity as neutral element. Again, for every subset a of V, inv_a is its own inverse. Both groups are Abelian.

1.2. Partial replication

We now introduce another operator for equation (1.1), called partial replication, which represents a partial mapping. After partial replication applied to rows and columns a of M, denoted by rep_a M, the argument relating to a, denoted by y_a, is replicated, while the relation for the argument of b = V \ a is preserved as in equation (1.1):

\[ \mathrm{rep}_a M \begin{pmatrix} y_a \\ y_b \end{pmatrix} = \begin{pmatrix} y_a \\ x_b \end{pmatrix}, \qquad \mathrm{rep}_a M = \begin{pmatrix} I_{aa} & 0_{ab} \\ M_{ba} & M_{bb} \end{pmatrix}. \quad (1.5) \]

Partial replication with respect to b applied to rep_a M is denoted in a way analogous to partial inversion,

\[ \mathrm{rep}_b(\mathrm{rep}_a M) = \mathrm{rep}_b \mathrm{rep}_a M = \mathrm{rep}_V M = \mathrm{rep}_{ab} M, \]

where rep_V M yields the identity matrix I. A basic property of partial replication is

\[ \mathrm{rep}_a \mathrm{rep}_a M = \mathrm{rep}_a M. \quad (1.6) \]

The following matrix forms relate directly to components of inv_b M:

\[ \text{(i)} \ (\mathrm{rep}_a M)^{-1} = \begin{pmatrix} I_{aa} & 0_{ab} \\ -M_{bb}^{-1} M_{ba} & M_{bb}^{-1} \end{pmatrix}, \qquad \text{(ii)} \ M(\mathrm{rep}_a M)^{-1} = \begin{pmatrix} M_{aa·b} & M_{ab} M_{bb}^{-1} \\ 0_{ba} & I_{bb} \end{pmatrix}, \]

and the last matrix product has been used in the numerical-analytic technique of block Gaussian elimination as one special form in which the Schur complement M_{aa·b} = M_{aa} - M_{ab} M_{bb}^{-1} M_{ba} occurs.

The operator is commutative but cannot be undone. It gives the same result when applied with respect to an index set within a submatrix of M as when applied to M and

then selecting the submatrix. For three disjoint subsets α, β, γ of V and a = α ∪ β,

\[ \text{(i)} \ \mathrm{rep}_α \mathrm{rep}_β M = \mathrm{rep}_β \mathrm{rep}_α M, \qquad \text{(ii)} \ \mathrm{rep}_{αβ} \mathrm{rep}_{βγ} M = \mathrm{rep}_{αβγ} M, \qquad \text{(iii)} \ \mathrm{rep}_α M_{aa} = [\mathrm{rep}_α M]_{a,a}. \quad (1.7) \]

Thus it shares commutativity (i) and exchangeability (iii) with partial inversion but differs regarding property (ii). Nevertheless, partial inversion can be expressed in terms of partial replication. By direct computation and with property (1.3)(ii),

\[ \mathrm{inv}_b M = (\mathrm{rep}_b M)(\mathrm{rep}_a M)^{-1}, \qquad \mathrm{inv}_a M = (\mathrm{rep}_a M)(\mathrm{rep}_b M)^{-1}. \quad (1.8) \]

1.3. Partial inversion combined with partial replication

Basic properties of the two operators combined are

\[ \text{(i)} \ \mathrm{inv}_a \mathrm{rep}_a M = \mathrm{rep}_a M, \qquad \text{(ii)} \ \mathrm{inv}_b \mathrm{rep}_a M = (\mathrm{rep}_a M)^{-1} = \mathrm{rep}_a \mathrm{inv}_b M, \qquad \text{(iii)} \ \mathrm{rep}_b \mathrm{inv}_b M = M(\mathrm{rep}_a M)^{-1}. \quad (1.9) \]

For a partition of V into α, β, γ, δ and s = V \ δ, some of the derived properties are

\[ \text{(i)} \ \mathrm{inv}_α \mathrm{rep}_β M = \mathrm{rep}_β \mathrm{inv}_α M, \qquad \text{(ii)} \ \mathrm{inv}_{αβ} \mathrm{rep}_{βγ} M = \mathrm{inv}_α \mathrm{rep}_{βγ} M, \qquad \text{(iii)} \ \mathrm{rep}_α \mathrm{inv}_β M_{ss} = [\mathrm{rep}_α \mathrm{inv}_β M]_{s,s}. \quad (1.10) \]

Thus the order of combining the operators matters unless they are applied to disjoint index sets and, for partial inversion applied after partial replication, overlapping index sets are ignored: a contraction instead of the expansion property (1.7)(ii) of partial replication.

Partial replication applied after partial inversion and to an overlapping index set coincides, just like (1.8), with a matrix product involving two partially replicated matrices:

\[ \mathrm{rep}_{βγ} \mathrm{inv}_{αβ} M = (\mathrm{rep}_{αγ} M)(\mathrm{rep}_{γδ} M)^{-1}. \quad (1.11) \]

For the first component on the right-hand side, partial replication is with respect to the symmetric difference of c = β ∪ γ and a = α ∪ β. The complement of the argument of

partial inversion, b = V \ a = γ ∪ δ, defines the partial replication in the second component. Conversely, given the partially replicated matrices on the right-hand side, partial inversion of M on the left-hand side is with respect to V \ (γ ∪ δ) = a, and the result is partially replicated with respect to β ∪ γ, the symmetric difference of a and a △ c.

Proof. For the proof of equation (1.11), we obtain vectors from the basic equation (1.1), partitioned in the order (α, β, γ, δ), with, as before, a = α ∪ β and b = γ ∪ δ:

\[ (y_a; y_b) = \begin{pmatrix} y_α \\ y_β \\ y_γ \\ y_δ \end{pmatrix}, \qquad (x_a; y_b) = \begin{pmatrix} x_α \\ x_β \\ y_γ \\ y_δ \end{pmatrix}, \qquad (x_{βδ}; y_{αγ}) = \begin{pmatrix} y_α \\ x_β \\ y_γ \\ x_δ \end{pmatrix}. \]

From the definitions of partial replication (1.5) and partial inversion (1.2), we use for (1.11)

\[ \mathrm{rep}_b M\, y = (x_a; y_b), \qquad \mathrm{inv}_a M\,(x_a; y_b) = (y_a; x_b), \]

and let N = inv_a M. Partial replication of N with respect to c = β ∪ γ modifies the vector (x_a; y_b) into (x_{βδ}; y_{αγ}), so that

\[ (\mathrm{rep}_c N)\,\mathrm{rep}_b M\, y = (x_{βδ}; y_{αγ}) \quad \text{and} \quad \mathrm{rep}_{αγ} M\, y = (x_{βδ}; y_{αγ}), \]

thus completing the proof.

It is instructive to see also a direct matrix proof. For this, we denote by Q the matrix obtained by partial replication with respect to a △ c and by R the one obtained with respect to b:

\[ Q = \mathrm{rep}_{a△c} M = \begin{pmatrix} I_{αα} & 0_{αβ} & 0_{αγ} & 0_{αδ} \\ M_{βα} & M_{ββ} & M_{βγ} & M_{βδ} \\ 0_{γα} & 0_{γβ} & I_{γγ} & 0_{γδ} \\ M_{δα} & M_{δβ} & M_{δγ} & M_{δδ} \end{pmatrix}, \qquad R = \mathrm{rep}_b M = \begin{pmatrix} M_{αα} & M_{αβ} & M_{αγ} & M_{αδ} \\ M_{βα} & M_{ββ} & M_{βγ} & M_{βδ} \\ 0_{γα} & 0_{γβ} & I_{γγ} & 0_{γδ} \\ 0_{δα} & 0_{δβ} & 0_{δγ} & I_{δδ} \end{pmatrix}. \]

For equation (1.11) to hold, one needs to show that Q R^{-1} = rep_c N, where

\[ R^{-1} = \begin{pmatrix} M_{aa}^{-1} & -M_{aa}^{-1} M_{ab} \\ 0_{ba} & I_{bb} \end{pmatrix} = \begin{pmatrix} N_{aa} & N_{ab} \\ 0_{ba} & I_{bb} \end{pmatrix}. \]

The rows of components α and γ in Q R^{-1} coincide directly with those of rep_c N. The

rows of component β in rep_c N result since, for Q and R, the rows of β coincide, so that the product Q_{βV} R^{-1} gives zeros whenever the row index within β for Q_{βV} differs from the column index in R^{-1}, and equals one otherwise. Finally, for the rows of component δ we have for Q_{δV} R^{-1}, by the defining equation for partial inversion (1.2),

\[ \begin{pmatrix} M_{δa} M_{aa}^{-1} & \ M_{δγ} - M_{δa} M_{aa}^{-1} M_{aγ} & \ M_{δδ} - M_{δa} M_{aa}^{-1} M_{aδ} \end{pmatrix} = \begin{pmatrix} N_{δa} & N_{δb} \end{pmatrix}, \]

which completes the proof.

The following use of equation (1.11) represents a change of basis from (x_a; y_b) to (x_d; y_c):

\[ \mathrm{rep}_{a△c} N\,(x_a; y_b) = (x_d; y_c) = \mathrm{inv}_c M\,(x_c; y_d). \quad (1.12) \]

Proof. For equation (1.12), note that, by the definition of partial replication (1.5) with respect to a △ c = α ∪ γ, the components x_α and y_γ of (x_a; y_b) are replicated. By the definition of partial inversion of M with respect to a (1.2), one knows further that

\[ N_{βa} x_a + N_{βb} y_b = y_β, \qquad N_{δa} x_a + N_{δb} y_b = x_δ. \]

In the following sections, partial replication is applied to quite distinct statistical themes. In Section 2, we simplify the proofs of a number of results known for different aspects of linear models. In Sections 3 and 4, linear relations are obtained for sets of parameters and for sets of estimates in nonlinear models within the exponential family of distributions.

2. Decompositions of partial inversion

2.1. A partially inverted covariance matrix

Partial replication provides with (1.8) a decomposition of partial inversion. Let Σ denote the joint covariance matrix of the mean-centred random vector variables Y_a, Y_b. Then, for instance,

\[ \mathrm{inv}_b Σ = \begin{pmatrix} Σ_{aa|b} & Π_{a|b} \\ -Π_{a|b}^T & Σ_{bb}^{-1} \end{pmatrix} = (\mathrm{rep}_b Σ)(\mathrm{rep}_a Σ)^{-1} = \begin{pmatrix} Σ_{aa} & Σ_{ab} \\ 0_{ba} & I_{bb} \end{pmatrix} \begin{pmatrix} I_{aa} & 0_{ab} \\ Σ_{ba} & Σ_{bb} \end{pmatrix}^{-1}, \]

with

\[ Σ_{aa|b} = Σ_{aa} - Σ_{ab} Σ_{bb}^{-1} Σ_{ba}, \qquad Π_{a|b} = Σ_{ab} Σ_{bb}^{-1}, \]

a one-to-one transformation of marginal covariances and variances, namely the elements of Σ, into parameters in the concentration matrix Σ_{bb}^{-1} of Y_b and into linear least-squares

regression parameters of Y_a on Y_b, i.e. in the model defined by the equations

\[ Y_a - Π_{a|b} Y_b = ε_a, \qquad E(ε_a) = 0, \qquad \mathrm{cov}(ε_a, Y_b) = 0, \qquad \mathrm{cov}(ε_a) = Σ_{aa|b}, \]

where the interpretation of Π_{a|b} as a matrix of regression coefficients results by post-multiplication with Y_b^T and taking expectations.

Zero constraints on (Π_{a|b}, Σ_{aa|b}) have been studied especially in econometrics. The possible existence of multiple solutions of the estimating equations, derived by maximizing the Gaussian likelihood function in a Zellner model, also called seemingly unrelated regressions (Zellner, 1962), has more recently been demonstrated by Drton and Richardson (2004). Zero constraints on concentrations, such as in Σ_{bb}^{-1}, had been introduced as a tool for parsimonious estimation of covariances, called covariance selection (Dempster, 1972). The Gaussian likelihood function has a unique maximum for all possible sets of zero concentrations, though for some models iterative algorithms are necessary to find the solution. Zero constraints on covariances, such as in Σ_{aa|b}, had been studied by Anderson (1969, 1973).

Whether zero constraints on Σ^{-1} are preserved for all variable pairs under the transformation inv_V Σ^{-1} = Σ is one typical problem of independence equivalence, studied within the field of graphical Markov models. Similarly, one investigates in this context when zero constraints on Σ^{-1} are preserved in inv_a Σ^{-1} = inv_b Σ.

2.2. Three recursion relations among linear parameter matrices

Let a set of variables be partitioned according to V = (a, γ, δ), again with b = γ ∪ δ but now a = α, and let an estimate of inv_b Σ be given, so that implicitly Y_a is regressed on Y_b. Assume that inv_δ Σ has been estimated in another study. Then which systematic changes are to be expected between the two sets of parameters and the corresponding estimates?
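The two operators and the decomposition (1.8) just used for inv_b Σ lend themselves to a numerical sketch. The following is a minimal illustration, not from the paper; it assumes numpy, the function names `partial_inv` and `partial_rep` are chosen here, and the test matrix is an arbitrary positive definite covariance.

```python
import numpy as np

def partial_inv(M, a):
    # inv_a M of eq. (1.2), written back into the original row/column positions
    d = M.shape[0]
    b = [i for i in range(d) if i not in a]
    W = np.linalg.inv(M[np.ix_(a, a)])                  # M_aa^{-1}
    N = np.empty_like(M, dtype=float)
    N[np.ix_(a, a)] = W
    N[np.ix_(a, b)] = -W @ M[np.ix_(a, b)]
    N[np.ix_(b, a)] = M[np.ix_(b, a)] @ W
    # Schur complement M_{bb.a} = M_bb - M_ba M_aa^{-1} M_ab
    N[np.ix_(b, b)] = M[np.ix_(b, b)] - M[np.ix_(b, a)] @ W @ M[np.ix_(a, b)]
    return N

def partial_rep(M, a):
    # rep_a M of eq. (1.5): the rows of a are replaced by identity rows
    R = np.array(M, dtype=float)
    R[a, :] = 0.0
    R[a, a] = 1.0
    return R

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
Sigma = B @ B.T + 4 * np.eye(4)       # a positive definite covariance matrix
a, b = [0, 1], [2, 3]

# properties (1.3)(i), (1.3)(ii) and the decomposition (1.8)
print(np.allclose(partial_inv(partial_inv(Sigma, a), a), Sigma))
print(np.allclose(np.linalg.inv(partial_inv(Sigma, a)), partial_inv(Sigma, b)))
print(np.allclose(partial_inv(Sigma, b),
                  partial_rep(Sigma, b) @ np.linalg.inv(partial_rep(Sigma, a))))

# blocks of inv_b Sigma: regression coefficients, conditional covariance, concentrations
N = partial_inv(Sigma, b)
Pi = Sigma[np.ix_(a, b)] @ np.linalg.inv(Sigma[np.ix_(b, b)])   # Pi_{a|b}
print(np.allclose(N[np.ix_(a, b)], Pi))
print(np.allclose(N[np.ix_(a, a)], Sigma[np.ix_(a, a)] - Pi @ Sigma[np.ix_(b, a)]))
print(np.allclose(N[np.ix_(b, b)], np.linalg.inv(Sigma[np.ix_(b, b)])))
```

All six checks print True for any matrix with invertible principal submatrices; the diagonal shift in the example simply guarantees that condition.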
Written explicitly, the two sets of parameters are

\[ \mathrm{inv}_b Σ = \begin{pmatrix} Σ_{aa|b} & Π_{a|γ·δ} & Π_{a|δ·γ} \\ -Π_{a|γ·δ}^T & Σ_{γγ·a} & Σ_{γδ·a} \\ -Π_{a|δ·γ}^T & Σ_{δγ·a} & Σ_{δδ·a} \end{pmatrix}, \qquad \mathrm{inv}_δ Σ = \begin{pmatrix} Σ_{aa|δ} & Σ_{aγ|δ} & Π_{a|δ} \\ Σ_{γa|δ} & Σ_{γγ|δ} & Π_{γ|δ} \\ -Π_{a|δ}^T & -Π_{γ|δ}^T & Σ_{δδ·aγ} \end{pmatrix}, \]

where the covariance and concentration blocks are symmetric, while the blocks involving a matrix Π are symmetric except for the sign. The matrices Σ_{aa|b}, Π_{a|b} are as defined before, but the latter is split into two components corresponding to the explanatory variables Y_γ, Y_δ as Π_{a|b} = (Π_{a|γ·δ}, Π_{a|δ·γ}), and

Σ_{γγ·a} denotes the submatrix of components γ in the concentration matrix of Y_b, that is, after having marginalized the overall concentration matrix Σ^{-1} over Y_a. Since inv_δ Σ = inv_{γa} Σ^{-1}, one knows from (1.8) that inv_δ Σ = (rep_γ inv_b Σ)(rep_{aδ} inv_b Σ)^{-1}, or

\[ \mathrm{inv}_δ Σ = \begin{pmatrix} Σ_{aa|b} & Π_{a|γ·δ} & Π_{a|δ·γ} \\ 0_{γa} & I_{γγ} & 0_{γδ} \\ -Π_{a|δ·γ}^T & Σ_{δγ·a} & Σ_{δδ·a} \end{pmatrix} \begin{pmatrix} I_{aa} & 0_{aγ} & 0_{aδ} \\ Σ_{γa|δ} & Σ_{γγ|δ} & Π_{γ|δ} \\ 0_{δa} & 0_{δγ} & I_{δδ} \end{pmatrix} \quad (2.1) \]

by using the following implications of (1.3)(iii):

\[ Σ_{γγ|δ} = (Σ_{γγ·a})^{-1}, \qquad Π_{γ|δ} = -(Σ_{γγ·a})^{-1} Σ_{γδ·a}, \qquad Σ_{γγ|δ}\, Π_{a|γ·δ}^T = Σ_{γa|δ}. \]

By the matrix product (2.1), one obtains directly several known recursion relations: one for covariances (Anderson, 1958, Section 2.5) in position (a, a),

\[ Σ_{aa|δ} = Σ_{aa|γδ} + Π_{a|γ·δ}\, Σ_{γa|δ}, \]

one for concentrations (Dempster, 1969, Chapter 4) in position (δ, δ),

\[ Σ_{δδ·aγ} = Σ_{δδ·a} + Σ_{δγ·a}\, Π_{γ|δ}, \]

and one for linear least-squares regression coefficients (Cochran, 1938) in position (a, δ),

\[ Π_{a|δ} = Π_{a|δ·γ} + Π_{a|γ·δ}\, Π_{γ|δ}. \quad (2.2) \]

Each of the three equations relates a marginal to a conditional parameter matrix. There will be no change in each of the three types of parameter matrices if and only if at least one of the terms in the added matrix product is zero. For the regression coefficient matrices in (2.2), this can be achieved by study design. If, for instance, Y_δ represents treatments, Y_γ factors influencing treatments, and there is randomized allocation of individuals to treatments, then the resulting independence of Y_δ and Y_γ implies Π_{γ|δ} = 0. The same result is achieved by an intervention in which all levels y_γ of variable Y_γ can be fixed. By contrast, an only notional intervention is assumed in the literature on causal models (Rubin, 1974; Pearl, 2000; Lauritzen, 2000) to describe the special situation in which overall effects (Π_{a|δ} y_δ) coincide with conditional effects (Π_{a|δ·γ} y_δ).

For a more detailed discussion of the assumptions used in the literature on causal models,

especially of the assumed partial order, here V = (a, γ, δ), see Lindley (2002) and Cox and Wermuth (2004); for extensions of the recursion relation for regression coefficients to nonlinear relations, see Cox and Wermuth (2003), Ma, Xie and Geng (2006), and Cox (2007); and for conditions under which other types of association measures remain unchanged, see Xie, Ma and Geng (2007). Recursion relations are important for comparisons of the results of any two empirical studies in which similar research questions are investigated on a core set of common variables but which differ with respect to some variables that are explicitly included in one study but not in the other.

2.3. Reduced form equations and omitted variables

Suppose that, on subject-matter grounds, a full ordering can be given to mean-centred random variables Y_i, for i = 1, ..., d, and that linear equations are generated which have uncorrelated residuals. Then this system can be described with the random vector Y, in which element i corresponds to Y_i, and

\[ A Y = ε, \qquad \mathrm{cov}(ε) = Δ \ \text{diagonal}, \quad (2.3) \]

where A is a unit upper-triangular matrix for which the negative of the element in position (i, j) is a least-squares regression coefficient, -A_{ij} = β_{i|j·r(i)\j}, with r(i) = (i + 1, ..., d). Let there be some zero constraints on the upper off-diagonal elements of A, and let the effects of the generating process on the joint dependence of each component of Y_a on Y_b be of interest, where a = (1, ..., d_a) and b = (d_a + 1, ..., d); we then call V = (a, b) an order-compatible split of V = (1, ..., d). From

\[ A = \begin{pmatrix} A_{aa} & A_{ab} \\ 0_{ba} & A_{bb} \end{pmatrix}, \qquad \mathrm{inv}_a A = \begin{pmatrix} A_{aa}^{-1} & -A_{aa}^{-1} A_{ab} \\ 0_{ba} & A_{bb} \end{pmatrix}, \]

and the mapping

\[ \mathrm{inv}_a A \begin{pmatrix} ε_a \\ Y_b \end{pmatrix} = \begin{pmatrix} Y_a \\ ε_b \end{pmatrix}, \]

the resulting equation in Y_a is

\[ Y_a = -A_{aa}^{-1} A_{ab} Y_b + η_a, \qquad η_a = A_{aa}^{-1} ε_a. \quad (2.4) \]
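Cochran's recursion (2.2) above can be checked numerically from an arbitrary positive definite Σ. This is a hedged sketch, not from the paper; numpy and the helper name `coef` are assumptions made here.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
Sigma = B @ B.T + 4 * np.eye(4)      # arbitrary positive definite covariance
a, g, d = [0], [1], [2, 3]           # index sets a, gamma, delta; b = gamma + delta

def coef(S, resp, regr):
    # least-squares coefficient matrix Pi_{resp|regr} = S_{resp,regr} S_{regr,regr}^{-1}
    return S[np.ix_(resp, regr)] @ np.linalg.inv(S[np.ix_(regr, regr)])

Pi_a_b = coef(Sigma, a, g + d)       # joint coefficients of Y_a on (Y_gamma, Y_delta)
Pi_a_g_d = Pi_a_b[:, :len(g)]        # Pi_{a|gamma.delta}
Pi_a_d_g = Pi_a_b[:, len(g):]        # Pi_{a|delta.gamma}
Pi_a_d = coef(Sigma, a, d)           # marginal coefficients of Y_a on Y_delta alone
Pi_g_d = coef(Sigma, g, d)

# Cochran's recursion (2.2): Pi_{a|d} = Pi_{a|d.g} + Pi_{a|g.d} Pi_{g|d}
print(np.allclose(Pi_a_d, Pi_a_d_g + Pi_a_g_d @ Pi_g_d))
```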

In the context of structural equations, this type of equation has been called a reduced form equation. For it, the starting equation (2.3) implies cov(η_a, Y_b) = 0, so that each component Y_i of Y_a is related by a linear least-squares regression to Y_b, and

\[ Π_{a|b} = -A_{aa}^{-1} A_{ab}, \qquad Σ_{aa|b} = A_{aa}^{-1} Δ_{aa} A_{aa}^{-T}. \]

The equations imply that, typically, some zero constraints of the generating equations (2.3) will carry over directly to the reduced form equations (2.4), while others take on a more complex form than zero constraints.

For more general types of structural equations than those of a triangular system (2.3), questions of identifiability of parameters arise, and early results (Fisher, 1966) provided answers for models in which the residual covariance matrix in the reduced form equation is unconstrained, see also Eusebi (2007), or in which it is still a diagonal matrix. Conditions under which such assumptions are satisfied are provided by more recent results on Markov equivalence of models.

Let now a = (α, β) be a further order-compatible split of a, and let Y_β be unavailable in a new study. Which effects are to be expected for the dependence of Y_α on Y_b? By equation (1.8), inv_β A = (rep_β A)(rep_{αb} A)^{-1}, which gives

\[ \begin{pmatrix} A_{αα} & A_{αβ} & A_{αb} \\ 0_{βα} & I_{ββ} & 0_{βb} \\ 0_{bα} & 0_{bβ} & A_{bb} \end{pmatrix} \begin{pmatrix} I_{αα} & 0_{αβ} & 0_{αb} \\ 0_{βα} & A_{ββ}^{-1} & -A_{ββ}^{-1} A_{βb} \\ 0_{bα} & 0_{bβ} & I_{bb} \end{pmatrix} = \begin{pmatrix} A_{αα} & A_{αβ} A_{ββ}^{-1} & A_{αb·β} \\ 0_{βα} & A_{ββ}^{-1} & -A_{ββ}^{-1} A_{βb} \\ 0_{bα} & 0_{bβ} & A_{bb} \end{pmatrix}, \]

with A_{αb·β} = A_{αb} - A_{αβ} A_{ββ}^{-1} A_{βb}. Thus the equation for Y_α in (2.3) is changed via (1.2) to

\[ A_{αα} Y_α + A_{αb·β} Y_b = ξ_α, \qquad ξ_α = ε_α - A_{αβ} A_{ββ}^{-1} ε_β, \qquad \mathrm{cov}(ξ_α) = K, \]

with the residuals for some component Y_i of Y_α typically correlated with the regressors for Y_i, so that the residual covariance matrix, K, is no longer diagonal. Then the equation parameters A_{αα}, A_{αb·β} need not define least-squares regression coefficients in the variables Y_α, Y_b, so that, in particular, a zero equation parameter for the pair Y_i, Y_j need not correspond to Y_i and Y_j being uncorrelated given any subset of the remaining variables.
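The reduced form parameters derived above can be sketched numerically. The following assumes numpy and constructs an arbitrary triangular system with diagonal residual covariance; names and seeds are choices made here, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
# unit upper-triangular A as in (2.3), with diagonal residual covariance Delta
A = np.eye(d) + np.triu(0.5 * rng.standard_normal((d, d)), k=1)
Delta = np.diag(rng.uniform(0.5, 2.0, d))
Sigma = np.linalg.inv(A) @ Delta @ np.linalg.inv(A).T      # implied cov(Y)

a, b = [0, 1], [2, 3]
Aaa_inv = np.linalg.inv(A[np.ix_(a, a)])

# reduced form (2.4): Pi_{a|b} = -A_aa^{-1} A_ab, Sigma_{aa|b} = A_aa^{-1} Delta_aa A_aa^{-T}
Pi_a_b = -Aaa_inv @ A[np.ix_(a, b)]
Saa_b = Aaa_inv @ Delta[np.ix_(a, a)] @ Aaa_inv.T

# compare with the regression parameters computed directly from Sigma
Pi_direct = Sigma[np.ix_(a, b)] @ np.linalg.inv(Sigma[np.ix_(b, b)])
Saa_b_direct = Sigma[np.ix_(a, a)] - Pi_direct @ Sigma[np.ix_(b, a)]
print(np.allclose(Pi_a_b, Pi_direct), np.allclose(Saa_b, Saa_b_direct))
```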
In econometrics, one speaks then of an endogenous response variable Y_i in a structural equation and of the endogeneity problem. The change in the dependence of Y_α on Y_b after ignoring the intermediate variable Y_β may be large whenever A_{αb} deviates much from A_{αb·β}. There is no change if, in the starting system (2.3), either A_{αβ} = 0, i.e. the final response Y_α is unaffected by the intermediate

variable, or A_{βb} = 0, i.e. the intermediate variable does not depend on Y_b. By contrast, the reduced form equations after omitting Y_β,

\[ Y_α = -A_{αα}^{-1} A_{αb·β} Y_b + η_α, \]

are the subset of the equations (2.4) corresponding to Y_α, with Π_{α|b} = -A_{αα}^{-1} A_{αb·β}. Thus the structural equations after omitting Y_β also make explicit the changes in the dependence of Y_α on Y_b when moving from the least-squares coefficients in (2.3) to the least-squares coefficients in the reduced form equations (2.4).

It may also be verified, for the given order V = (α, β, b), that marginalizing over Y_α has no effect on the remaining parameters in (2.3), but that marginalizing over the variables Y_b leads, in general, to a non-diagonal residual covariance matrix, K say. As a consequence of a non-diagonal K, not only may the interpretation of equation parameters as least-squares regression coefficients in the observed variables be lost, but an identification problem may also result. However, a large identified subclass has been characterized recently, see Brito and Pearl (2002), Wermuth and Cox (2007), Drton, Eichler and Richardson (2007). It has K_{ij} ≠ 0 only if A_{ij} = 0. Expressed differently, the two sets of variable pairs with constraints in A and K may overlap, but the two unconstrained sets of variable pairs within A and within K are disjoint. This class of models helps to detect a type of confounding, called indirect, that had so far essentially been overlooked, see Wermuth and Cox (2007). Indirect confounding poses a problem even for intervention studies that use randomized allocation of treatments to individuals, i.e. for what is regarded as the gold standard of designed studies.

2.4. Changing to a different split of three types of linear parameters

Let V = (α, β, γ, δ), again with a = α ∪ β, b = γ ∪ δ, let another split of V be given by c = β ∪ γ and d = α ∪ δ, and suppose that inv_b Σ has been studied and estimated in one study, but inv_d Σ in another.
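The omitted-variable formulas of Section 2.3 can also be checked numerically: the equation parameters after omitting Y_β reproduce the regression of Y_α on Y_b computed from the implied covariance matrix. A minimal sketch, assuming numpy and an arbitrary triangular system:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4
A = np.eye(d) + np.triu(0.5 * rng.standard_normal((d, d)), k=1)  # unit upper triangular
Delta = np.diag(rng.uniform(0.5, 2.0, d))
Sigma = np.linalg.inv(A) @ Delta @ np.linalg.inv(A).T            # implied cov(Y)

al, be, b = [0], [1], [2, 3]          # alpha, beta, b; Y_beta is the omitted variable

# A_{alpha b.beta} = A_ab - A_{alpha beta} A_{beta beta}^{-1} A_{beta b}
A_al_b_be = (A[np.ix_(al, b)]
             - A[np.ix_(al, be)] @ np.linalg.inv(A[np.ix_(be, be)]) @ A[np.ix_(be, b)])
Pi_al_b = -np.linalg.inv(A[np.ix_(al, al)]) @ A_al_b_be

# regression of Y_alpha on Y_b computed directly from Sigma
Pi_direct = Sigma[np.ix_(al, b)] @ np.linalg.inv(Sigma[np.ix_(b, b)])
print(np.allclose(Pi_al_b, Pi_direct))
```

Setting the entry `A[0, 1]` to zero before computing `Sigma` illustrates the "no change" case A_{αβ} = 0, since then A_{αb·β} reduces to A_{αb}.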
Then the change is between the following two sets of parameters:

\[ \mathrm{inv}_b Σ = \begin{pmatrix} Σ_{αα|b} & Σ_{αβ|b} & Π_{α|γ·δ} & Π_{α|δ·γ} \\ Σ_{βα|b} & Σ_{ββ|b} & Π_{β|γ·δ} & Π_{β|δ·γ} \\ -Π_{α|γ·δ}^T & -Π_{β|γ·δ}^T & Σ_{γγ·a} & Σ_{γδ·a} \\ -Π_{α|δ·γ}^T & -Π_{β|δ·γ}^T & Σ_{δγ·a} & Σ_{δδ·a} \end{pmatrix}, \]

\[ \mathrm{inv}_d Σ = \begin{pmatrix} Σ_{αα·c} & -Π_{β|α·δ}^T & -Π_{γ|α·δ}^T & Σ_{αδ·c} \\ Π_{β|α·δ} & Σ_{ββ|d} & Σ_{βγ|d} & Π_{β|δ·α} \\ Π_{γ|α·δ} & Σ_{γβ|d} & Σ_{γγ|d} & Π_{γ|δ·α} \\ Σ_{δα·c} & -Π_{β|δ·α}^T & -Π_{γ|δ·α}^T & Σ_{δδ·c} \end{pmatrix}, \]

where, as before, the covariance and concentration blocks are symmetric and the blocks involving a matrix Π are symmetric except for the sign. If we take E(Y) = 0 and let

\[ Z = Σ^{-1} Y \quad (2.5) \]

then the implicit corresponding change in linear systems is between

\[ \mathrm{inv}_a Σ^{-1} \begin{pmatrix} Z_α \\ Z_β \\ Y_γ \\ Y_δ \end{pmatrix} = \begin{pmatrix} Y_α \\ Y_β \\ Z_γ \\ Z_δ \end{pmatrix} \quad \text{and} \quad \mathrm{inv}_c Σ^{-1} \begin{pmatrix} Y_α \\ Z_β \\ Z_γ \\ Y_δ \end{pmatrix} = \begin{pmatrix} Z_α \\ Y_β \\ Y_γ \\ Z_δ \end{pmatrix}. \quad (2.6) \]

Proof. For equation (2.6), we start with Σ^{-1} Y = Z in mean-centred Y, let σ^{ij} denote the overall concentration of Y_i and Y_j, and let Y_{a|b} = Y_a - Π_{a|b} Y_b denote the variable Y_a corrected for its linear least-squares prediction given Y_b. Then E(Z Z^T) = Σ^{-1} and E(Y Z^T) = I, so that Y_i corrected for least-squares regression on all remaining variables is

\[ Y_{i|j≠i} = Y_i + \sum_{j≠i} (σ^{ij}/σ^{ii}) Y_j = Z_i/σ^{ii}. \]

In matrix notation, with D a diagonal matrix having elements σ^{ii}, so that D^{-1} has elements 1/σ^{ii} = σ_{ii|j≠i}, and with U = D^{-1} Z, the equation Σ^{-1} Y = Z is modified first into D^{-1} Σ^{-1} Y = U and, after partial inversion of D^{-1} Σ^{-1} with respect to a, into the left equation in (2.6), i.e. into

\[ \mathrm{inv}_a Σ^{-1} (Z_a; Y_b) = (Y_a; Z_b). \quad (2.7) \]

For the final step of the proof of (2.6), we use the following result: for any square matrix M, defined as before, and for any positive definite diagonal matrix D,

\[ \mathrm{inv}_a(D M) = (\mathrm{rep}_a D)(\mathrm{inv}_a M)(\mathrm{rep}_b D)^{-1}. \]

Equation (2.7) can be rewritten as

\[ Y_{a|b} = Σ_{aa|b} Z_a, \qquad Y_b = Σ_{ba} Z_a + Σ_{bb} Z_b, \quad (2.8) \]

so that, for example, the linear independence of Y_{a|b} and Y_b in the given form is verified directly with

\[ \mathrm{cov}(Σ_{ba} Z_a + Σ_{bb} Z_b, Z_a) = Σ_{ba} Σ^{aa} + Σ_{bb} Σ^{ba} = 0. \]
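The identities in (2.8) hold pointwise for any realization of Y, which makes them easy to verify numerically. A minimal sketch, assuming numpy and an arbitrary positive definite Σ:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4))
Sigma = B @ B.T + 4 * np.eye(4)
K = np.linalg.inv(Sigma)              # concentration matrix Sigma^{-1}
a, b = [0, 1], [2, 3]

Pi_a_b = Sigma[np.ix_(a, b)] @ np.linalg.inv(Sigma[np.ix_(b, b)])
Saa_b = Sigma[np.ix_(a, a)] - Pi_a_b @ Sigma[np.ix_(b, a)]

Y = rng.standard_normal(4)            # any realization of the random vector
Z = K @ Y                             # Z = Sigma^{-1} Y, eq. (2.5)

# (2.8): Y_{a|b} = Sigma_{aa|b} Z_a and Y_b = Sigma_ba Z_a + Sigma_bb Z_b
print(np.allclose(Y[a] - Pi_a_b @ Y[b], Saa_b @ Z[a]))
print(np.allclose(Y[b], Sigma[np.ix_(b, a)] @ Z[a] + Sigma[np.ix_(b, b)] @ Z[b]))

# orthogonality used in the proof: Sigma_ba Sigma^{aa} + Sigma_bb Sigma^{ba} = 0
print(np.allclose(Sigma[np.ix_(b, a)] @ K[np.ix_(a, a)]
                  + Sigma[np.ix_(b, b)] @ K[np.ix_(b, a)], 0))
```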

After writing (2.8) as

\[ Y_a - Π_{a|b} Y_b = ε_a, \qquad \mathrm{cov}(ε_a) = Σ_{aa|b}, \qquad Σ_{bb·a} Y_b = Z_b + Π_{a|b}^T Z_a, \quad (2.9) \]

the change in parameter sets for the equations (2.6) is seen to be from (Π_{a|b}, Σ_{aa|b}, Σ_{bb·a}) to (Π_{c|d}, Σ_{cc|d}, Σ_{dd·c}), and the change in the equations (2.6) is defined by the following three reversible transformations:

\[ \mathrm{inv}_c Σ^{-1} = \mathrm{inv}_{a△c}\, \mathrm{inv}_a Σ^{-1}, \qquad (Z_c; Y_d) = (\mathrm{rep}_{a△d}\, \mathrm{inv}_a Σ^{-1})(Z_a; Y_b), \qquad (Z_d; Y_c) = (\mathrm{rep}_{a△c}\, \mathrm{inv}_a Σ^{-1})(Z_a; Y_b). \quad (2.10) \]

Proof. The transformations in (2.10) result from equations (1.8) and (1.12), and they lead, again with (1.8), to inv_c Σ^{-1} (Z_c; Y_d) defined in terms of (rep_{a△c} inv_a Σ^{-1})(Z_a; Y_b).

2.5. Changing from concentrations to parameters of linear chain graphs

Parameter sets in linear multivariate regression chains are obtained when, starting with equations (2.5), (2.9), the equations on the right of (2.9), involving the marginalized concentration matrix, are repeatedly modified in the same way as Σ^{-1} Y = Z is transformed into equations (2.9). Multivariate regression chains are used for the study of panel data and are simplified using zero constraints, see Cox and Wermuth (1993, 1996).

By using the same type of repeated transformations as for multivariate regression chains, but selecting different sets of parameters, distinct types of linear chain graph models are defined, see Wermuth, Wiedenbeck and Cox (2006). For instance, models result that have been called blocked concentration chains or LWF chains (for Lauritzen and Wermuth (1989) and Frydenberg (1990)), and those called mixed concentration-regression chains or AMP chains (for Andersson, Madigan and Perlman (2001)). The interpretation of zero constraints depends on the selected type of parameter, so that the different sets of zero constraints permit the formulation of various types of research hypotheses that may be of interest in different subject-matter contexts.
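The first transformation in (2.10), a consequence of property (1.4)(ii), can be sketched numerically; `partial_inv` is a name chosen here for the operator of eq. (1.2), and the test matrix is an arbitrary concentration matrix.

```python
import numpy as np

def partial_inv(M, s):
    # inv_s M of eq. (1.2), written back into the original row/column positions
    d = M.shape[0]
    r = [i for i in range(d) if i not in s]
    W = np.linalg.inv(M[np.ix_(s, s)])
    N = np.empty_like(M, dtype=float)
    N[np.ix_(s, s)] = W
    N[np.ix_(s, r)] = -W @ M[np.ix_(s, r)]
    N[np.ix_(r, s)] = M[np.ix_(r, s)] @ W
    N[np.ix_(r, r)] = M[np.ix_(r, r)] - M[np.ix_(r, s)] @ W @ M[np.ix_(s, r)]
    return N

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4))
K = np.linalg.inv(B @ B.T + 4 * np.eye(4))    # a concentration matrix Sigma^{-1}

a, c = [0, 1], [1, 2]                         # a = alpha + beta, c = beta + gamma
a_sym_c = [0, 2]                              # symmetric difference of a and c

# first transformation in (2.10): inv_c K = inv_{a sym-diff c} inv_a K
print(np.allclose(partial_inv(K, c), partial_inv(partial_inv(K, a), a_sym_c)))
```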
Some direct extensions are to constraints in nonlinear chain graph models within the context of the exponential family models discussed in the next sections. Typically, any conditional independence hypothesis then involves several parameters that are constrained to be zero.

3. Estimation in reduced exponential families

For a full exponential family, we take the log-likelihood, after disregarding terms that do not depend on the unknown parameter, in the form s^T φ - K(φ), where φ is the canonical parameter and s the sufficient statistic, a realization of the random variable S. The cumulant generating function of S under the full exponential family is K(φ + t) - K(φ). The mean or moment parameter, η = ∇K(φ), is the gradient of K(φ) with respect to φ, i.e. a vector of first derivatives. The maximum-likelihood estimate of η is η̂ = s.

The gradient of η with respect to φ gives the covariance matrix of S but also the concentration matrix of the maximum-likelihood estimate φ̂ of the canonical parameter, that is,

\[ \mathrm{cov}(S) = ∇∇^T K(φ) = \mathrm{con}(φ̂), \]

where φ̂ denotes for simplicity both a maximum-likelihood estimate of φ and the corresponding random variable. The notation I = ∇∇^T K(φ) reminds one that R. A. Fisher interpreted minus the second derivative of the log-likelihood, which here equals I, as the information about the canonical parameter contained in a single observation.

Given η̂ and Ī, the observed information matrix, i.e. I evaluated at η̂ = s, studentized statistics for testing that an individual component η_i of η is zero are obtained for a large sample size n as η̂_i/√(Ī_{ii}). Similarly, since the random variable Ī^{-1}(η̂ - η) has mean zero and covariance matrix Ī^{-1}, and hence the same mean and covariance as (φ̂ - φ), a studentized statistic for testing that an individual component φ_i of φ is zero may, for large n, be computed as φ̂_i/√((Ī^{-1})_{ii}).

At a maximum of the likelihood function under a full exponential model, often also called the saturated or the largest covering model, it holds that

\[ Ī^{-1} η̂ = z, \qquad z = Ī^{-1} s, \quad (3.1) \]

which is of the form of (2.5), so that the results of the previous section apply to it, especially as used below for (3.4).

We now consider special reduced exponential models, those given by constraints η_c = 0

for some subset of the elements of η, writing η = (η_u, η_c). By differentiating the Lagrangian

\[ s^T φ - K(φ) - λ^T η_c \]

with respect to φ, the maximum-likelihood estimating equations are obtained as

\[ η̂_u = s_u - Î_{uc} Î_{cc}^{-1} s_c, \qquad η̂_c = 0, \quad (3.2) \]

where the matrix I is evaluated at the maximum-likelihood estimate in the reduced model. Since the solution of the maximum-likelihood equations in general requires iterative algorithms and may not be unique, an efficient closed-form approximation is useful, see Cox and Wermuth (1990), Wermuth, Cox and Marchetti (2006); it has been called the reduced model estimate of η_u. Equation (3.2) is thereby modified into

\[ η̃_u = s_u - Ĩ_{uc} Ĩ_{cc}^{-1} s_c, \qquad η̃_c = 0, \quad (3.3) \]

where the (u, c) and (c, c) components of Î in (3.2) have been replaced by the corresponding components of the asymptotic covariance matrix Ĩ of S, which do not involve unknown parameters. Equations (3.2) and (3.3) result also, by using (2.9) and inv_u I^{-1} = inv_c I in (3.1), when the (u, c) and (c, c) components of the matrix I are replaced by Î and by Ĩ, respectively, and the (u, u) component is evaluated at η_c = 0.

Since the use of reduced model estimates is recommended only in situations in which the constraints agree well with the observed data, choosing the full observed matrix Ī of φ̂ under the saturated model and then partially inverting it on c should not give estimates which differ much from those in equations (3.2) and (3.3). By equations (2.9), (3.1), we then have

\[ η̄_u = s_u - Ī_{uc} Ī_{cc}^{-1} s_c, \qquad Ī_{cc·u}\, η̄_c = z_c + (Ī_{uc} Ī_{cc}^{-1})^T z_u, \quad (3.4) \]

with E(η̄_u) = η_u, cov(η̄_u) = Ī_{uu|c} and cov(η̄_c) = Ī_{cc}. In the case of a poor fit to the hypothesis η_c = 0, i.e. with some components of η̄_c deviating much from zero, equations (2.10) permit a direct change to the fit under an alternative hypothesis.
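The correspondence between (3.4) and (2.8) can be sketched numerically: the closed-form estimate η̄_u equals Ī_{uu|c} z_u, the analogue of Y_{a|b} = Σ_{aa|b} Z_a. In the following, Ī and s are simulated stand-ins (an arbitrary positive definite matrix and an arbitrary vector), not quantities from any particular model; numpy is assumed.

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((4, 4))
Ibar = B @ B.T + 4 * np.eye(4)        # stand-in for the observed information matrix
s = rng.standard_normal(4)            # stand-in for the sufficient statistic
z = np.linalg.inv(Ibar) @ s           # z = Ibar^{-1} s, as in (3.1)
u, c = [0, 1], [2, 3]

# closed-form estimate in (3.4): eta_u = s_u - I_uc I_cc^{-1} s_c
eta_u = s[u] - Ibar[np.ix_(u, c)] @ np.linalg.inv(Ibar[np.ix_(c, c)]) @ s[c]

# the same estimate via the analogue of (2.8): eta_u = I_{uu|c} z_u
Iuu_c = (Ibar[np.ix_(u, u)]
         - Ibar[np.ix_(u, c)] @ np.linalg.inv(Ibar[np.ix_(c, c)]) @ Ibar[np.ix_(c, u)])
print(np.allclose(eta_u, Iuu_c @ z[u]))
```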
Since Ī_{uc} Ī_{cc}^{-1} can be viewed as the observed coefficient matrix of S_c in the linear least-squares regression of S_u on S_c, standard results apply for testing that I_{uc} I_{cc}^{-1} = 0, given an estimate of the appropriate covariance matrix. Expansions into relevant interaction parameters lead to studentized statistics for interaction terms and provide insight into where a possibly poor fit is located. See Cox and Wermuth (1990) for examples and Lauritzen and Wermuth

(1989) for a discussion of interaction parameters in the case of Conditional-Gaussian (CG) distributions and of CG-regressions. The latter contain, for instance, logistic regression as a special case, for which, in general, iterative fitting algorithms are needed to give η̂ even under the saturated model, see Edwards and Lauritzen (2001).

The closed-form estimates of η_u in (3.4) provide a new justification for the reduced model estimates: they result by partially inverting equations (3.1) with respect to the subset given by the set of unconstrained parameters. These estimates can also be viewed as the generalized least-squares estimates of Aitken (1935), which turn, for instance for the multinomial distribution, into the estimates of Grizzle, Starmer and Koch (1969), see Cox and Snell (1981, Appendix 1), Cox and Wermuth (1990, Section 7).

Under some general regularity conditions, equations (3.1), and hence the estimates of η_u in (3.4), arise from asymptotic theory in more general settings than the exponential family, see Cox (2006, Chapter 6). They are then called sandwich estimates, which were introduced in a special context by Huber (1964), or, in the context of generalized linear models, approximate residual maximum-likelihood (REML) estimates, which were derived by Patterson and Thompson (1971).

By similar arguments as above, the maximum-likelihood equations with zero constraints on canonical parameters are obtained in a form comparable to equations (3.2). From a theoretical viewpoint, these may be more attractive than zero constraints on moment parameters. Firstly, if the constrained canonical parameters exist, then the maximum-likelihood equations have a unique solution. Secondly, the sets of minimal sufficient statistics are of reduced size.
Thirdly, estimates are available in closed form for Gaussian and for multinomial distributions provided the model is decomposable, that is, the associated independence graph can be arranged in a sequence of possibly overlapping but complete prime graphs, see Cox and Wermuth (1999). For non-decomposable models, it is in addition often simple to find a not much larger, decomposable covering model.

4. Switching partially between canonical and moment parameters

So far, we have considered only linear mappings, even though these were relevant both for nonlinear model formulations and for correlated data. We shall now illustrate how changes between moment and canonical parameters in exponential families may be obtained in terms of partial replication.

A general formulation has been given by Cox (2006, Section 6.4), including a short proof of the orthogonality, i.e. uncorrelatedness, of canonical and moment parameters and of the asymptotic independence of the corresponding maximum-likelihood estimates, see also Barndorff-Nielsen (1978, Section 9.8). These general results have often been proven

for specific members of exponential families, involving the then necessary lengthy, detailed arguments. In the notation of the present paper, for the moment parameter η = K(φ) and the canonical parameter φ with maximum-likelihood estimate denoted by φ̂, let s = (s_a, s_b) represent a split of the sufficient statistic into two column vectors, and let φ_b be replaced by η_b, the corresponding component of η, to give the mixed parameter vector ψ = (φ_a, η_b). Then with

I = ∂η^T/∂φ,    rep_a I = ∂ψ^T/∂φ = ( I_aa  0_ab ; I_ba  I_bb ),

one obtains, given φ̂, a maximum-likelihood estimate of ψ under the saturated model as

ψ̂ = (rep_a Ī) φ̂,    côv(ψ̂) = (rep_a Ī) Ī^{-1} (rep_a Ī)^T = ( Ī^{-1}_{aa.b}  0_ab ; 0_ba  Ī_bb ).    (4.1)

The two random variables corresponding to φ̂_a and η̂_b are uncorrelated and hence, given their asymptotic joint Gaussian distribution, also asymptotically independent. One simple example is a joint Gaussian distribution, with η̂_b the observed mean of Y_b and φ̂_a the observed overall concentration matrix of Y_a. Another is for two dichotomous variables, with η̂_b the difference in observed frequencies for Y_b and φ̂_a the observed log-odds ratio of Y_a. Two complementary mappings may, respectively, be represented by

(φ_a; η_b) = (rep_a I) φ,    (η_a; φ_b) = (rep_b I) φ,

so that for such mappings, with the specific choice M = I as in equations (1.2) and (1.8),

(φ_a; η_b) = inv_a I (η_a; φ_b),    inv_a I = (rep_a I)(rep_b I)^{-1}.    (4.2)

One important consequence of (4.2) is that, given (4.1), the change from a split (a, b) to another split (c, d), with canonical parameters for c and moment parameters for d, is possible by using (2.10) after just replacing Σ by I. Another application is to constrained chain graph models of different types when the conditional distributions are members of the exponential family. For the computation of the observed information matrix in the case of both discrete and continuous variables, see Dempster (1973) and Cox and Wermuth (1990).
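The identities above can be checked numerically. The following sketch (ours, not from the paper) uses a 2×2 observed information matrix with singleton sets a and b, implements partial replication in the convention used above (the a-rows of the matrix replaced by the corresponding identity rows), and verifies both the block-diagonal covariance in (4.1) and the composition rule for inv_a I in (4.2); the matrix Ī and the estimate φ̂ are invented for illustration.

```python
# Numerical sketch of (4.1) and (4.2) for d = 2 with a = {0}, b = {1}.
# I_bar and phi_hat are invented; rep replaces the a-rows of a matrix
# by the corresponding identity rows and keeps the b-rows.

def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inv2(A):
    # inverse of a 2x2 matrix
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def rep(M, rows_a):
    # partial replication: identity rows for indices in rows_a, M-rows otherwise
    d = len(M)
    return [[1.0 if i == j else 0.0 for j in range(d)] if i in rows_a
            else list(M[i]) for i in range(d)]

I_bar = [[2.0, 0.5], [0.5, 1.0]]    # observed information, symmetric
phi_hat = [[0.3], [-0.7]]           # canonical parameter estimate
eta_hat = matmul(I_bar, phi_hat)    # corresponding moment parameters

rep_a, rep_b = rep(I_bar, {0}), rep(I_bar, {1})

# mixed estimate psi_hat = (phi_hat_a; eta_hat_b)
psi_hat = matmul(rep_a, phi_hat)
assert psi_hat[0][0] == phi_hat[0][0]
assert abs(psi_hat[1][0] - eta_hat[1][0]) < 1e-12

# covariance in (4.1): block-diagonal with inv(I_aa.b) and I_bb
cov = matmul(matmul(rep_a, inv2(I_bar)), transpose(rep_a))
assert abs(cov[0][1]) < 1e-12 and abs(cov[1][0]) < 1e-12
assert abs(cov[1][1] - I_bar[1][1]) < 1e-12
I_aa_b = I_bar[0][0] - I_bar[0][1] * I_bar[1][0] / I_bar[1][1]
assert abs(cov[0][0] - 1.0 / I_aa_b) < 1e-12

# equation (4.2): inv_a I = (rep_a I)(rep_b I)^{-1} maps (eta_a; phi_b)
# to (phi_a; eta_b)
inv_a = matmul(rep_a, inv2(rep_b))
out = matmul(inv_a, [[eta_hat[0][0]], [phi_hat[1][0]]])
assert abs(out[0][0] - phi_hat[0][0]) < 1e-12
assert abs(out[1][0] - eta_hat[1][0]) < 1e-12
print("all identities verified")
```

With larger index sets the same checks go through blockwise; only the 2×2 inverse helper would need to be generalized.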
Whenever the population matrix I is replaced by its observed counterpart Ī, the linear relation in equation (4.2) between sets of parameters in possibly nonlinear models turns into a linear relation between the corresponding maximum-likelihood estimates.

Acknowledgement

We thank the Swedish Research Society for supporting our cooperation via the Gothenburg Stochastic Center, and D. R. Cox and G. M. Marchetti for their most helpful comments.

References

Aitken, A. C. (1935). On least squares and linear combination of observations. Proc. Roy. Soc. Edin. A 55.
Anderson, T. W. (1958). An Introduction to Multivariate Statistical Analysis. Wiley, New York.
Anderson, T. W. (1969). Statistical inference for covariance matrices with linear structure. In: Multivariate Analysis II (Edited by P. R. Krishnaiah), 55-66, Academic Press, New York.
Anderson, T. W. (1973). Asymptotically efficient estimation of covariance matrices with linear structure. Ann. Statist. 1.
Andersson, S., Madigan, D. and Perlman, M. D. (2001). Alternative Markov properties for chain graphs. Scand. J. Statist. 28.
Barndorff-Nielsen, O. E. (1978). Information and Exponential Families in Statistical Theory. Wiley, Chichester.
Brito, C. and Pearl, J. (2002). A new identification condition for recursive models with correlated errors. Struct. Equation Mod. 9.
Cochran, W. G. (1938). The omission or addition of an independent variate in multiple linear regression. Suppl. J. Roy. Statist. Soc. 5.
Cox, D. R. (2006). Principles of Statistical Inference. Cambridge University Press, Cambridge.
Cox, D. R. (2007). On a generalization of a result of W. G. Cochran. Biometrika 94. To appear.
Cox, D. R. and Wermuth, N. (1990). An approximation to maximum-likelihood estimates in reduced models. Biometrika 77.
Cox, D. R. and Wermuth, N. (1993). Linear dependencies represented by chain graphs (with discussion). Statist. Science 8.
Cox, D. R. and Wermuth, N. (1996). Multivariate Dependencies: Models, Analysis and Interpretation. Chapman and Hall, London.

Cox, D. R. and Wermuth, N. (1999). Likelihood factorizations for mixed discrete and continuous variables. Scand. J. Statist. 26.
Cox, D. R. and Wermuth, N. (2003). A general condition for avoiding effect reversal after marginalization. J. Roy. Statist. Soc. B 65.
Cox, D. R. and Wermuth, N. (2004). Causality: a statistical view. Int. Statist. Rev. 72.
Dempster, A. P. (1969). Elements of Continuous Multivariate Analysis. Addison-Wesley, Reading.
Dempster, A. P. (1972). Covariance selection. Biometrics 28.
Dempster, A. P. (1973). Aspects of the multinomial logit model. In: Proc. 3rd Symp. Mult. Anal. (Edited by P. R. Krishnaiah), Academic Press, New York.
Drton, M. and Richardson, T. S. (2004). Multimodality of the likelihood in the bivariate seemingly unrelated regression model. Biometrika 91.
Drton, M., Eichler, M. and Richardson, T. S. (2007). Identification and likelihood inference for recursive linear models with correlated errors. Submitted.
Edwards, D. and Lauritzen, S. L. (2001). The TM algorithm for maximising a conditional likelihood function. Biometrika 88.
Eusebi, P. (2007). A graphical method for assessing the identification of linear structural equation models. Struct. Equation Modelling. To appear.
Fisher, F. M. (1966). The Identification Problem in Econometrics. McGraw Hill, New York.
Frydenberg, M. (1990). The chain graph Markov property. Scand. J. Statist. 17.
Grizzle, J. E., Starmer, C. F. and Koch, G. C. (1969). Analysis of categorical data by linear models. Biometrics 25.
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Stat. 35.
Lauritzen, S. L. (2000). Causal inference from graphical models. In: Complex Stochastic Systems (Edited by O. E. Barndorff-Nielsen, D. R. Cox and C. Klüppelberg), Chapman and Hall, London.
Lauritzen, S. L. and Wermuth, N. (1989). Graphical models for associations between variables, some of which are qualitative and some quantitative. Ann. Statist. 17.
Lindley, D. V. (2002). Seeing and doing: the concept of causation. Int. Statist. Rev. 70.

Ma, Z., Xie, X. and Geng, Z. (2006). Collapsibility of distribution dependence. J. R. Statist. Soc. B 68.
Marchetti, G. M. and Wermuth, N. (2007). Matrix representations and independencies in directed acyclic graphs. Submitted.
Patterson, H. D. and Thompson, R. (1971). Recovery of interblock information when block sizes are unequal. Biometrika 58.
Pearl, J. (2000). Causality. Cambridge University Press, Cambridge.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educational Psychol. 66.
Wermuth, N. and Cox, D. R. (2004). Joint response graphs and separation induced by triangular systems. J. Roy. Statist. Soc. B 66.
Wermuth, N. and Cox, D. R. (2007). Distortion of effects caused by indirect confounding. Biometrika. To appear.
Wermuth, N., Cox, D. R. and Marchetti, G. M. (2006). Covariance chains. Bernoulli 12.
Wermuth, N., Wiedenbeck, M. and Cox, D. R. (2006). Partial inversion for linear systems and partial closure of independence graphs. BIT Num. Math. 46.
Xie, X., Ma, Z. and Geng, Z. (2007). Some association measures and their collapsibility. Statistica Sinica. To appear.
Zellner, A. (1962). An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J. Amer. Statist. Assoc. 57.

Center of Survey Research, Mannheim, Germany
wiedenbeck@zuma-mannheim.de

Department of Statistics, Chalmers/Göteborgs Universitet, Sweden
wermuth@math.chalmers.se


More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

RESEARCH REPORT. Estimation of sample spacing in stochastic processes. Anders Rønn-Nielsen, Jon Sporring and Eva B.

RESEARCH REPORT. Estimation of sample spacing in stochastic processes.   Anders Rønn-Nielsen, Jon Sporring and Eva B. CENTRE FOR STOCHASTIC GEOMETRY AND ADVANCED BIOIMAGING www.csgb.dk RESEARCH REPORT 6 Anders Rønn-Nielsen, Jon Sporring and Eva B. Vedel Jensen Estimation of sample spacing in stochastic processes No. 7,

More information

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA

More information

2. Basic concepts on HMM models

2. Basic concepts on HMM models Hierarchical Multinomial Marginal Models Modelli gerarchici marginali per variabili casuali multinomiali Roberto Colombi Dipartimento di Ingegneria dell Informazione e Metodi Matematici, Università di

More information

Conditional Least Squares and Copulae in Claims Reserving for a Single Line of Business

Conditional Least Squares and Copulae in Claims Reserving for a Single Line of Business Conditional Least Squares and Copulae in Claims Reserving for a Single Line of Business Michal Pešta Charles University in Prague Faculty of Mathematics and Physics Ostap Okhrin Dresden University of Technology

More information

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science Algorithms For Inference Fall 2014

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science Algorithms For Inference Fall 2014 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 Problem Set 3 Issued: Thursday, September 25, 2014 Due: Thursday,

More information

A nonparametric test for path dependence in discrete panel data

A nonparametric test for path dependence in discrete panel data A nonparametric test for path dependence in discrete panel data Maximilian Kasy Department of Economics, University of California - Los Angeles, 8283 Bunche Hall, Mail Stop: 147703, Los Angeles, CA 90095,

More information

Testing Restrictions and Comparing Models

Testing Restrictions and Comparing Models Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by

More information

Identification of the age-period-cohort model and the extended chain ladder model

Identification of the age-period-cohort model and the extended chain ladder model Identification of the age-period-cohort model and the extended chain ladder model By D. KUANG Department of Statistics, University of Oxford, Oxford OX TG, U.K. di.kuang@some.ox.ac.uk B. Nielsen Nuffield

More information

Regression. Oscar García

Regression. Oscar García Regression Oscar García Regression methods are fundamental in Forest Mensuration For a more concise and general presentation, we shall first review some matrix concepts 1 Matrices An order n m matrix is

More information

TIME SERIES ANALYSIS. Forecasting and Control. Wiley. Fifth Edition GWILYM M. JENKINS GEORGE E. P. BOX GREGORY C. REINSEL GRETA M.

TIME SERIES ANALYSIS. Forecasting and Control. Wiley. Fifth Edition GWILYM M. JENKINS GEORGE E. P. BOX GREGORY C. REINSEL GRETA M. TIME SERIES ANALYSIS Forecasting and Control Fifth Edition GEORGE E. P. BOX GWILYM M. JENKINS GREGORY C. REINSEL GRETA M. LJUNG Wiley CONTENTS PREFACE TO THE FIFTH EDITION PREFACE TO THE FOURTH EDITION

More information

Estimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004

Estimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004 Estimation in Generalized Linear Models with Heterogeneous Random Effects Woncheol Jang Johan Lim May 19, 2004 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure

More information

Analysis of Survival Data Using Cox Model (Continuous Type)

Analysis of Survival Data Using Cox Model (Continuous Type) Australian Journal of Basic and Alied Sciences, 7(0): 60-607, 03 ISSN 99-878 Analysis of Survival Data Using Cox Model (Continuous Type) Khawla Mustafa Sadiq Department of Mathematics, Education College,

More information

Estimation of Unique Variances Using G-inverse Matrix in Factor Analysis

Estimation of Unique Variances Using G-inverse Matrix in Factor Analysis International Mathematical Forum, 3, 2008, no. 14, 671-676 Estimation of Unique Variances Using G-inverse Matrix in Factor Analysis Seval Süzülmüş Osmaniye Korkut Ata University Vocational High School

More information

Intrinsic products and factorizations of matrices

Intrinsic products and factorizations of matrices Available online at www.sciencedirect.com Linear Algebra and its Applications 428 (2008) 5 3 www.elsevier.com/locate/laa Intrinsic products and factorizations of matrices Miroslav Fiedler Academy of Sciences

More information

(K + L)(c x) = K(c x) + L(c x) (def of K + L) = K( x) + K( y) + L( x) + L( y) (K, L are linear) = (K L)( x) + (K L)( y).

(K + L)(c x) = K(c x) + L(c x) (def of K + L) = K( x) + K( y) + L( x) + L( y) (K, L are linear) = (K L)( x) + (K L)( y). Exercise 71 We have L( x) = x 1 L( v 1 ) + x 2 L( v 2 ) + + x n L( v n ) n = x i (a 1i w 1 + a 2i w 2 + + a mi w m ) i=1 ( n ) ( n ) ( n ) = x i a 1i w 1 + x i a 2i w 2 + + x i a mi w m i=1 Therefore y

More information

Consistent Bivariate Distribution

Consistent Bivariate Distribution A Characterization of the Normal Conditional Distributions MATSUNO 79 Therefore, the function ( ) = G( : a/(1 b2)) = N(0, a/(1 b2)) is a solu- tion for the integral equation (10). The constant times of

More information

Instrumental Sets. Carlos Brito. 1 Introduction. 2 The Identification Problem

Instrumental Sets. Carlos Brito. 1 Introduction. 2 The Identification Problem 17 Instrumental Sets Carlos Brito 1 Introduction The research of Judea Pearl in the area of causality has been very much acclaimed. Here we highlight his contributions for the use of graphical languages

More information

CS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas

CS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas CS839: Probabilistic Graphical Models Lecture 7: Learning Fully Observed BNs Theo Rekatsinas 1 Exponential family: a basic building block For a numeric random variable X p(x ) =h(x)exp T T (x) A( ) = 1

More information

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. We consider asymptotic and finite-sample confidence bounds in instrumental

More information

MEANINGFUL REGRESSION COEFFICIENTS BUILT BY DATA GRADIENTS

MEANINGFUL REGRESSION COEFFICIENTS BUILT BY DATA GRADIENTS Advances in Adaptive Data Analysis Vol. 2, No. 4 (2010) 451 462 c World Scientific Publishing Company DOI: 10.1142/S1793536910000574 MEANINGFUL REGRESSION COEFFICIENTS BUILT BY DATA GRADIENTS STAN LIPOVETSKY

More information