Stochastic Quasi-likelihood for Case-Control Point Pattern Data
Ganggang Xu, Rasmus Waagepetersen and Yongtao Guan

January 8, 2018

Abstract

We propose a novel stochastic quasi-likelihood estimation procedure for case-control point processes. Quasi-likelihood for point processes depends on a certain optimal weight function, and for the new method the weight function is stochastic since it depends on the control point pattern. The new procedure also provides a computationally efficient implementation of quasi-likelihood for univariate point processes, in which case a synthetic control point process is simulated by the user. Under mild conditions, the proposed approach yields consistent and asymptotically normal parameter estimators. We further show that the estimators are optimal in the sense that the associated Godambe information is maximal within a wide class of estimating functions for case-control point processes. The effectiveness of the proposed method is illustrated using extensive simulation studies and two data examples.

Some key words: Case-control data, Godambe information, Optimal estimating equations, Point process, Stochastic quasi-likelihood.

Short title: Stochastic Quasi-likelihood for Case-Control Point Pattern Data

Ganggang Xu is Assistant Professor (gang@math.binghamton.edu), Department of Mathematical Sciences, Binghamton University, State University of New York, NY. Rasmus Waagepetersen is Professor (rw@math.aau.dk), Department of Mathematical Sciences, Aalborg University, Denmark. Yongtao Guan is Leslie O. Barnes Professor (yguan@bus.miami.edu), Department of Management Science, University of Miami, Coral Gables, FL. Xu's research was supported by Collaboration Grants for Mathematicians from the Simons Foundation (Award Number: ).
Waagepetersen's research was supported by The Danish Council for Independent Research-Natural Sciences, grant DFF Statistics for point processes in space and beyond, and by the Centre for Stochastic Geometry and Advanced Bioimaging, funded by grant 8721 from the Villum Foundation. Guan's research was supported by National Institutes of Health grant R01 CA. The authors thank the editor, the associate editor and anonymous referees for their constructive comments that led to substantial improvements of the article. The authors also thank Prof. Hansheng Wang and Mr. Yu Chen for their help in collecting the Beijing restaurant location data.
1 Introduction

Today spatially referenced datasets on human activities can be easily harvested from social media platforms and mobile devices equipped with GPS. Such data can be of great interest, e.g., to sociologists, geographers, economists and marketing analysts. As one example, we consider in Section 4.1 bivariate point pattern data obtained from the Chinese search engine baidu.com, with the locations of Chinese and Western-style restaurants in Beijing. Other examples include data from Twitter or analogues giving times of tweets and locations of the persons tweeting (Lu et al., 2016).

For spatial point pattern data regarding human activities in urban environments, a particular challenge is the very complex form of the intensity function. For example, commercial restaurants in Beijing typically cannot be found in parks and certain official areas, so that the intensity function in such areas is effectively zero. The intensity can also vary abruptly when moving from one neighbourhood into another. A full model of the intensity function would therefore require very detailed information on the geography of the city, and gathering such information can be a cumbersome task.

These difficulties are well known in spatial epidemiology, where spatial point processes have been used as an effective tool to investigate risk factors for various diseases; see, for example, Diggle (1990), Diggle and Rowlingson (1994), Diggle et al. (1997), and Zimmerman et al. (2012). Suppose that a spatial point process $N$ is used to model the locations of occurrences of a disease over a spatial domain $W$ in the population at risk. Then a commonly used model for the intensity function $\lambda_N(\cdot)$ of $N$ is

$$\lambda_N(s) = \psi(s)\lambda_0(s), \quad s \in W, \tag{1}$$

where $\lambda_0(s)$ serves as a baseline intensity function related to the population at risk and the nonnegative factor $\psi(s)$ models the elevated or reduced risk that an individual located at $s$ catches the disease.
In this model, $\psi(\cdot)$ is of primary interest while $\lambda_0(\cdot)$ can be viewed as an
infinite-dimensional nuisance parameter. Typically a parametric model $\psi(s; \beta)$ is assumed for the dependence of the risk at location $s$ on some risk factors $Z(s)$ associated with $s$. For example, one popular choice is $\psi(s; \beta) = \exp\{\beta^T Z(s)\}$ with $Z(s)$ being some environmental, demographic, or life-style related variables at $s \in W$. With such a structure, the parameter $\beta$ gives a direct interpretation of the potential risk related to the risk factors $Z(s)$. For reasons similar to the ones mentioned previously, the specification of $\lambda_0(\cdot)$ is much more intricate and a simple parametric model may not be tenable. Instead it is common to estimate $\lambda_0(\cdot)$ nonparametrically, e.g., by using kernel smoothing (Diggle, 1990). However, the resulting estimator may not be consistent, as argued by Guan (2008). Moreover, the impact of a potentially inconsistent estimator of $\lambda_0(\cdot)$ on the inference regarding the parameter $\beta$ is difficult to quantify.

An appealing alternative is to use case-control data including an additional control process $M$, also observed over $W$. The intensity function of $M$ is assumed to be of the form

$$\lambda_M(s) = \alpha(s)\lambda_0(s), \quad s \in W, \tag{2}$$

where $\alpha(s)$ is the sampling intensity when collecting the control data. The value of $\alpha(\cdot)$ is determined by the actual sampling scheme of the case-control study and is thus often considered known (Diggle and Rowlingson, 1994; Zimmerman et al., 2012). For the Beijing restaurants, for example, we will study the spatial pattern of Western-style restaurants using a random sample of the Chinese restaurants as a control process.

To clarify the assumptions and scope of our modeling approach we consider the following illustrative example. Suppose there exist three sets of spatial covariates, $X(s)$, $Y(s)$ and $Z(s)$, conditioned on which the case process $N$ and the control process $M$ are independent Poisson processes with intensity functions

$$\Lambda_N(s) = \exp\{\beta_Y^T Y(s) + (\beta_Z + \beta)^T Z(s) + \beta_X^T X(s)\} \quad \text{and} \quad \Lambda_M(s) = \alpha(s)\exp\{\beta_Y^T Y(s) + \beta_Z^T Z(s)\}. \tag{3}$$

However, only $Z(s)$ is collected in the observed data and both $X(s)$ and $Y(s)$ need to be treated
as latent processes. Note that the latent process $Y(s)$ affects both the case and control processes equivalently, but $X(s)$ only affects the case process. Assume that $X(s)$ is independent of $Y(s)$ and $Z(s)$ and that $\beta_0 = \log[E \exp\{\beta_X^T X(s)\}]$ does not depend on $s$. Then, conditioned on $Y(s)$ and $Z(s)$, the control process $M$ is still a Poisson process with intensity $\lambda_M(s) = \Lambda_M(s) = \alpha(s)\lambda_0(s)$, where $\lambda_0(s) = \exp\{\beta_Y^T Y(s) + \beta_Z^T Z(s)\}$, and the case process $N$ becomes a Cox process with intensity $\lambda_N(s) = E\{\Lambda_N(s) \mid Y(s), Z(s)\} = \lambda_0(s)\psi(s; \beta)$, where $\psi(s; \beta) = \exp\{\beta_0 + \beta^T Z(s)\}$.

The first take-away from the above example is that the parameter $\beta$ needs to be interpreted as the elevated or reduced impact of $Z(s)$ on the case intensity $\lambda_N(s)$ compared to its impact on the control intensity $\lambda_M(s)$. Let $Z_j(s)$ and $\beta_j$ be the $j$th elements of $Z(s)$ and $\beta$, respectively. Then $\beta_j = 0$ means that $Z_j(s)$ has the same effect on $N$ and $M$. Secondly, a key assumption in (3) is that every factor affecting both the case and control intensities is either observed, like $Z(s)$, or contributes equally to both intensities, like $Y(s)$. The major difficulty in treating the baseline intensity $\lambda_0(s) = \exp\{\beta_Y^T Y(s) + \beta_Z^T Z(s)\}$ as an unknown deterministic function is that its estimation can be rather challenging without observing $Y(s)$. This difficulty, however, can be avoided by using the proportional structure between the intensity functions defined in (1) and (2), under which $\lambda_0(s)$ need not be estimated. Consequently, our theoretical investigations can treat $\lambda_0(s)$ and $\psi(s)$ in (1) and (2) as deterministic functions, conditioned on which $N$ and $M$ are assumed to be independent of each other with $M$ being a Poisson process.

Diggle and Rowlingson (1994) studied this case-control setting where $N$ and $M$ are independent Poisson processes and proposed a conditional likelihood approach to estimate the unknown parameter $\beta$.
However, the strong independence properties implied by the Poisson assumption for the case process $N$ may be too restrictive since they preclude possible interactions among the cases. For example, the Poisson assumption may not be appropriate for modeling infectious diseases (Diggle et al., 1997; Diggle et al., 2007). For this reason, we consider the scenario where the case process $N$ may exhibit clustering. For example, in the above illustrative example, by allowing spatial
dependence in the latent process $X(s)$ in (3), conditioned on $\lambda_0(s) = \exp\{\beta_Y^T Y(s) + \beta_Z^T Z(s)\}$ and $\psi(s; \beta) = \exp\{\beta_0 + \beta^T Z(s)\}$, the case process $N$ becomes a Cox process, which may have additional aggregation relative to the control process $M$.

Diggle and Rowlingson's conditional likelihood may be viewed from an estimating function point of view. Consider the following spatial increment:

$$U(ds; \beta) = N(ds) - \frac{\psi(s; \beta)}{\alpha(s)} M(ds), \tag{4}$$

where $N(ds)$ and $M(ds)$ denote the numbers of points from $N$ and $M$ in an infinitesimal spatial increment $ds$ located around the spatial location $s$. Note that $E\{N(ds)\} = \lambda_N(s)ds$ and $E\{M(ds)\} = \lambda_M(s)ds$. Then, based on (1) and (2), it is trivial to see that $E\{U(ds; \beta)\} = 0$. As a result, using general theory on estimating equations (see, e.g., Crowder, 1986), we can estimate the $p \times 1$ parameter vector $\beta$ consistently by solving the following estimating equation:

$$Q_f(\beta) = \int_W f(s; \beta)U(ds; \beta) = \sum_{s \in N} f(s; \beta) - \sum_{s \in M} \frac{\psi(s; \beta)}{\alpha(s)} f(s; \beta) = 0_p, \tag{5}$$

where $f(s; \beta)$ is a $p \times 1$ real vector-valued function and $0_p$ is a $p \times 1$ vector of zeros. When $f(s; \beta) = \gamma(s; \beta)\psi^{(1)}(s; \beta)/\psi(s; \beta)$, where $\gamma(s; \beta) = \alpha(s)/\{\alpha(s) + \psi(s; \beta)\}$ and $\psi^{(1)}(s; \beta) = \partial\psi(s; \beta)/\partial\beta$, (5) becomes equivalent to the score function of Diggle and Rowlingson's conditional likelihood. In fact, Rathbun (2012) showed that if both $N$ and $M$ are Poisson processes, this choice of weight function is optimal in the sense of yielding minimal parameter estimation variance. However, when $N$ is not Poisson, the conditional likelihood is no longer optimal. To the best of our knowledge, no previous work has attempted to determine the optimal weight function $f(\cdot; \beta)$ for the class of estimating functions of the form (5). In this paper we fill this gap.
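As an illustration of estimating equation (5), the following minimal Python sketch evaluates $Q_f(\beta)$ for the conditional-likelihood weight $f = \gamma\psi^{(1)}/\psi$ under the log-linear model $\psi(s; \beta) = \exp\{\beta^T Z(s)\}$ and solves $Q_f(\beta) = 0_p$ by Newton's method. The function names (`q_cle`, `solve_cle`), the constant $\alpha$, and the toy data are assumptions made for this example and are not part of the paper. In particular, the toy data are generated by labelling pooled points as cases with probability $\psi/(\alpha + \psi)$, a shortcut that mimics two independent Poisson processes conditional on the pooled locations, not the simulation design of Section 3.

```python
import numpy as np

def q_cle(beta, Z_case, Z_ctrl, alpha):
    """Estimating function (5) with f = gamma * psi^(1) / psi and psi = exp(beta^T Z)."""
    psi_case = np.exp(Z_case @ beta)
    psi_ctrl = np.exp(Z_ctrl @ beta)
    w_case = alpha / (alpha + psi_case)       # gamma(s) at case locations
    w_ctrl = psi_ctrl / (alpha + psi_ctrl)    # {psi(s) / alpha} * gamma(s) at control locations
    return Z_case.T @ w_case - Z_ctrl.T @ w_ctrl

def solve_cle(Z_case, Z_ctrl, alpha, n_iter=50, tol=1e-10):
    """Newton iterations for Q_f(beta) = 0 (hypothetical helper for this sketch)."""
    beta = np.zeros(Z_case.shape[1])
    Z_all = np.vstack([Z_case, Z_ctrl])
    for _ in range(n_iter):
        psi_all = np.exp(Z_all @ beta)
        p = psi_all / (alpha + psi_all)
        # Jacobian of q_cle with respect to beta
        J = -(Z_all * (p * (1.0 - p))[:, None]).T @ Z_all
        step = np.linalg.solve(J, q_cle(beta, Z_case, Z_ctrl, alpha))
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy data (assumption): one covariate plus an intercept, constant alpha.
rng = np.random.default_rng(1)
n_points, alpha = 3000, 1.0
beta_true = np.array([0.3, 1.0])
Z = np.column_stack([np.ones(n_points), rng.normal(size=n_points)])
psi_true = np.exp(Z @ beta_true)
is_case = rng.random(n_points) < psi_true / (alpha + psi_true)

beta_hat = solve_cle(Z[is_case], Z[~is_case], alpha)
q_at_solution = q_cle(beta_hat, Z[is_case], Z[~is_case], alpha)
```

With this weight, (5) coincides with the score of a logistic regression of the case/control label on $Z(s)$ with offset $-\log\alpha(s)$, which is why the Newton step above has the familiar logistic form.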
Our approach is built upon a recent development on the quasi-likelihood method for spatial point processes (Guan et al., 2015), where the authors considered the problem of finding the optimal first-order estimating function for a single spatial point process by solving a certain Fredholm integral equation. The development of quasi-likelihood for case-control data faces two
major challenges. Firstly, there is an unobserved latent baseline intensity function $\lambda_0(\cdot)$. As we will see in Section 2, the theoretical optimal weight function is the solution to a Fredholm integral equation involving $\lambda_0(\cdot)$. If one chooses to estimate $\lambda_0(\cdot)$, then the benefits of using the case-control approach would be lost. Secondly, the quasi-likelihood method in Guan et al. (2015) relies on deterministic numerical approximations of two key integrals: (1) the integral in the estimating equation, see Section 2.6; and (2) the integral in the Fredholm integral equation when solving for the optimal weight function. While the former introduces bias to the estimating equation that is difficult to quantify (Baddeley et al., 2014), the latter approximation may invalidate the asymptotic results obtained based on the theoretical optimal weight function. Further, both approximations require covariate information at all numerical quadrature points. For many case-control data sets, covariate information may be readily available at the observed case and control locations but not necessarily at an arbitrary point in the study region. It may therefore require additional work to derive the covariate information at the quadrature points required for the deterministic numerical approximation.

To overcome these challenges, we develop a stochastic quasi-likelihood approach for case-control data that does not rely on $\lambda_0(\cdot)$ and uses only the observed covariate information. We propose a carefully designed leave-one-out algorithm to eliminate estimation bias. We prove that, under suitable conditions, our proposed approach leads to an estimator that is asymptotically as efficient as the one based on the theoretical optimal weight function. Furthermore, we derive the asymptotic distribution of the regression parameter estimators based on the estimated weight function and not its theoretical optimal counterpart.
We also discuss how the method can be applied to univariate point pattern data by simulating a synthetic control point process.

The rest of the paper is organized as follows. A detailed discussion is given in Section 2 on the motivation and practical implementation of the proposed method. Simulation studies are conducted in Section 3 and two real data applications are considered in Section 4. Asymptotic results are given in Section 5. A sketch of the proposed algorithm is given in the Appendix and
all technical proofs are collected in the supplementary material.

2 Stochastic quasi-likelihood using case-control data

2.1 Background

In this paper we assume that the control process $M$ is an inhomogeneous Poisson process independent of $N$. However, we allow for possible correlations between counts of $N$ as quantified by the pair correlation function $g(\cdot, \cdot)$ of $N$ defined through

$$\mathrm{Cov}\{N(ds), N(dt)\} = \delta(s - t)\lambda_N(s; \beta)ds + \lambda_N(s; \beta)\lambda_N(t; \beta)\{g(s, t) - 1\}ds\,dt, \tag{6}$$

where $\delta(\cdot)$ is the Dirac function satisfying $\int_A \delta(s - t)dt = I(s \in A)$ and $I(\cdot)$ denotes the indicator function. For a Poisson process, where the counts in disjoint sets are independent, $g(s, t) = 1$ for any $s \neq t$. Values of $g$ greater (smaller) than one typically correspond to clustered (regular) behavior of $N$. By (6), the pair correlation function is symmetric, $g(s, t) = g(t, s)$. In addition we assume that $g(s, t)$ depends on $s$ and $t$ only through $s - t$. In other words, the case process $N$ is assumed to be second-order intensity reweighted stationary (Baddeley et al., 2000).

A popular measure of efficiency for estimating functions is the Godambe information (Song, 2007). For our estimating function (5), the Godambe information is

$$G_f(\beta) = S_f^T(\beta)V_f^{-1}(\beta)S_f(\beta), \tag{7}$$

where $S_f(\beta) = -E\{\partial Q_f(\beta)/\partial\beta^T\}$, $V_f(\beta) = \mathrm{Var}\{Q_f(\beta)\}$, and the expectation and variance are with respect to both the case process $N$ and the control process $M$. For any two weight functions $f_1(\cdot; \beta)$ and $f_2(\cdot; \beta)$, $Q_{f_1}(\beta)$ is said to be more efficient than $Q_{f_2}(\beta)$ if $G_{f_1}(\beta) - G_{f_2}(\beta)$ is nonnegative definite, denoted as $G_{f_1}(\beta) \geq G_{f_2}(\beta)$. The optimal estimating function $Q_\phi(\beta)$ is thus defined as the one associated with the optimal weight function $\phi(\cdot; \beta)$ such that

$$G_\phi(\beta) \geq G_f(\beta) \quad \text{for any } f(\cdot; \beta) : W \to \mathbb{R}^p. \tag{8}$$
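The ordering in (7) and (8) can be checked numerically in a simple special case. The sketch below is an illustration under stated assumptions, not the authors' code: it takes both processes to be Poisson ($g \equiv 1$), in which case (4) and (6) give $S_f = \int_W \lambda_0 f \psi^{(1)T} ds$ and $V_f = \int_W \lambda_0 \psi \gamma^{-1} f f^T ds$, and it compares the weight $f = \gamma\psi^{(1)}/\psi$ with the unweighted choice $f = \psi^{(1)}/\psi$ using midpoint quadrature on $W = [0, 1]$. The window, the intensities, and the covariate are hypothetical toy choices.

```python
import numpy as np

def godambe(S, V):
    """Godambe information G = S^T V^{-1} S as in (7)."""
    return S.T @ np.linalg.solve(V, S)

# Midpoint quadrature on W = [0, 1] (toy setting, assumed for illustration).
n_cells = 2000
s = (np.arange(n_cells) + 0.5) / n_cells
ds = 1.0 / n_cells

lam0 = np.full(n_cells, 200.0)           # baseline intensity, treated as known here
alpha = np.full(n_cells, 2.0)            # control sampling intensity
beta = np.array([0.2, 1.0])
Z = np.column_stack([np.ones(n_cells), np.sin(2.0 * np.pi * s)])
psi = np.exp(Z @ beta)
gamma = alpha / (alpha + psi)
dpsi = psi[:, None] * Z                  # psi^(1)(s; beta), one row per quadrature cell

def sensitivity_and_variance(f):
    """S_f and V_f when both N and M are Poisson (g = 1)."""
    S = (lam0[:, None, None] * f[:, :, None] * dpsi[:, None, :]).sum(axis=0) * ds
    V = ((lam0 * psi / gamma)[:, None, None] * f[:, :, None] * f[:, None, :]).sum(axis=0) * ds
    return S, V

f_opt = gamma[:, None] * Z               # gamma * psi^(1) / psi
f_alt = Z                                # psi^(1) / psi, without the gamma weight

G_opt = godambe(*sensitivity_and_variance(f_opt))
G_alt = godambe(*sensitivity_and_variance(f_alt))
min_eig = np.linalg.eigvalsh(G_opt - G_alt).min()
```

In this Poisson setting the difference `G_opt - G_alt` should be nonnegative definite, so `min_eig` should be nonnegative up to rounding error.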
2.2 The optimal estimating function

By Guan et al. (2015), a sufficient condition for $\phi(\cdot; \beta)$ to be the optimal weight function is that

$$S_f(\beta) = \mathrm{Cov}\{Q_f(\beta), Q_\phi(\beta)\} \quad \text{for all } f(\cdot; \beta) : W \to \mathbb{R}^p. \tag{9}$$

By the definition of the spatial increment process $U(\cdot; \beta)$ in (4), it is trivial to show that

$$-E\left\{\frac{\partial U(ds; \beta)}{\partial\beta}\right\} = \lambda_0(s)\psi^{(1)}(s; \beta)ds,$$

and

$$\mathrm{Cov}\{U(ds; \beta), U(dt; \beta)\} = \delta(s - t)\frac{\alpha(s) + \psi(s; \beta)}{\alpha(s)}\lambda_N(s; \beta)ds + \lambda_N(s; \beta)\lambda_N(t; \beta)\{g(s, t) - 1\}ds\,dt,$$

which leads to

$$S_f(\beta) = \int_W \lambda_0(s)f(s; \beta)\psi^{(1)}(s; \beta)^T ds, \tag{10}$$

and

$$\mathrm{Cov}\{Q_f(\beta), Q_\phi(\beta)\} = \int_W\int_W \lambda_0(s)\lambda_0(t)\psi(s; \beta)\psi(t; \beta)\{g(s, t) - 1\}f(s; \beta)\phi(t; \beta)^T ds\,dt + \int_W \lambda_0(s)\gamma^{-1}(s; \beta)\psi(s; \beta)f(s; \beta)\phi(s; \beta)^T ds, \tag{11}$$

where $\gamma(\cdot; \beta)$ and $\psi^{(1)}(\cdot; \beta)$ were defined below equation (5). Combining (9)-(11), the optimal weight function $\phi(\cdot; \beta)$ is the solution to the following integral equation:

$$\phi(s; \beta) + \gamma(s; \beta)\int_W \lambda_0(t)\psi(t; \beta)\{g(s, t) - 1\}\phi(t; \beta)dt = \gamma(s; \beta)\frac{\psi^{(1)}(s; \beta)}{\psi(s; \beta)},$$

or equivalently,

$$\phi(s; \beta) + \int_W \lambda_0(t)\alpha(t)R(s, t; \beta)\phi(t; \beta)dt = \eta(s; \beta), \tag{12}$$

where $R(s, t; \beta) = \gamma(s; \beta)\psi(t; \beta)\{g(s, t) - 1\}/\alpha(t)$ and $\eta(s; \beta) = \gamma(s; \beta)\psi^{(1)}(s; \beta)/\psi(s; \beta)$. When $\gamma(\cdot; \beta)$, $\psi(\cdot; \beta)$ and $g(\cdot, \cdot)$ are continuous functions and $g(\cdot, \cdot) - 1$ is a positive definite function, the solution to (12) is unique (Guan et al., 2015). From now on, for ease of notation, we suppress the dependence of the functions $R(\cdot, \cdot)$, $\gamma(\cdot)$, $\psi(\cdot)$, and $\eta(\cdot)$ on $\beta$ whenever there is no ambiguity.

When $N$ is a Poisson point process, $g(s, t) = 1$ for any $s, t$. Then the optimal weight function obtained through (12) has the closed form $\phi_d(\cdot; \beta) = \eta(\cdot; \beta)$, for which the resulting $Q_{\phi_d}$
is equivalent to the score function of the conditional likelihood proposed by Diggle and Rowlingson (1994). Furthermore, $\phi_d(\cdot; \beta)$ also coincides with the optimal weight function studied in Waagepetersen (2008) and Rathbun (2012) for inhomogeneous Poisson processes. Our approach is therefore an important generalization of the conditional likelihood approach by Diggle and Rowlingson (1994). For a more general point process $N$, the integral equation (12) appropriately takes into account the correlation structure of $N$ as given by the pair correlation function $g(\cdot, \cdot)$.

2.3 A naive stochastic estimator of $\phi(\cdot; \beta)$ using all controls

The integral equation (12) belongs to the extensively studied class of Fredholm integral equations of the second kind; see, e.g., Hackbusch (1995), Zemyan (2012) and Kress (2014). Guan et al. (2015) proposed to find an approximate solution of the Fredholm integral equation under their consideration through the Nyström method, which is based on a deterministic numerical quadrature approximation of the integral that effectively converts the integral equation into a matrix equation. As we argued in Section 1, such an approach may not be applicable in our setting because $\lambda_0(\cdot)$ is typically unknown and complete knowledge of $\psi(t)$ and $\eta(t)$ may not be available at all quadrature points. Below we introduce a stochastic variant of the Nyström method which avoids this difficulty.

Considering the integral in (12), Campbell's formula gives

$$\int_W \lambda_0(t)\alpha(t)R(s, t)\phi(t; \beta)dt = E\left\{\sum_{t \in M \cap W} R(s, t)\phi(t; \beta)\right\}, \quad s \in W.$$

Therefore, a naive proposal to estimate $\phi(\cdot; \beta)$ is to solve the following stochastic equation:

$$\hat\phi(s; \beta) + \sum_{t \in M \cap W} R(s, t)\hat\phi(t; \beta) = \eta(s), \quad \text{where } s \in M \cap W. \tag{13}$$

Note that $\lambda_0(s)$ is not present in the above equation. Denote by $\hat\phi_k(s; \beta)$ and $\eta_k(s)$ the $k$th components of $\hat\phi(s; \beta)$ and $\eta(s)$, respectively, for $k = 1, \ldots, p$. Let further $\{s_1, \ldots, s_{|M|}\}$ denote
a realization of $M$ of cardinality $|M|$. Then the components of the solution to equation (13) are

$$\begin{pmatrix} \hat\phi_k(s_1, M; \beta) \\ \hat\phi_k(s_2, M; \beta) \\ \vdots \\ \hat\phi_k(s_{|M|}, M; \beta) \end{pmatrix} = \{I_{|M|} + R(M, M)\}^{-1}\begin{pmatrix} \eta_k(s_1) \\ \eta_k(s_2) \\ \vdots \\ \eta_k(s_{|M|}) \end{pmatrix}, \quad k = 1, \ldots, p, \tag{14}$$

where $R(M, M)$ is an $|M| \times |M|$ matrix whose $ij$th entry is $R(s_i, s_j)$, $i, j = 1, \ldots, |M|$. Note that we deliberately emphasize the dependence of $\hat\phi_k(s_i, M; \beta)$ on the entire control process $M$, since this dependence is critical to our theoretical investigation. Denoting $\hat\phi(s_i, M; \beta) = (\hat\phi_1(s_i, M; \beta), \ldots, \hat\phi_p(s_i, M; \beta))^T$ for any $s_i \in M$, we use the following interpolation for an arbitrary location $s \in W$:

$$\hat\phi(s, M; \beta) = \eta(s) - \sum_{t \in M \cap W} R(s, t)\hat\phi(t, M; \beta). \tag{15}$$

Plugging $\hat\phi(s, M; \beta)$ in for $f(s; \beta)$ in (5), we obtain

$$Q_{SQLn}(\beta) = \sum_{s \in N \cap W} \hat\phi(s, M; \beta) - \sum_{s \in M \cap W} \hat\phi(s, M; \beta)\frac{\psi(s)}{\alpha(s)}.$$

However, by the Campbell-Mecke theorem (Baddeley, 2007, Theorem 3.2),

$$E_{M,N}\{Q_{SQLn}(\beta)\} = \int_W \left[E_M\{\hat\phi(s, M; \beta)\}\lambda_N(s) - E_M^{s}\{\hat\phi(s, M; \beta)\}\lambda_0(s)\psi(s)\right]ds,$$

where $E_M^{s}$ denotes the Palm expectation of the point process $M$ at a point $s \in W$; see Baddeley (2007) for more details. Since the Palm distribution and the original point process distribution are different even in the case of a Poisson process (Baddeley, 2007), it follows that $E_{M,N}\{Q_{SQLn}(\beta)\} \neq 0_p$. Therefore, the naive plug-in estimating function $Q_{SQLn}(\beta)$ is biased, which in turn results in bias in the associated parameter estimator. Our simulation studies confirm that the parameter estimator can be substantially biased.

2.4 Leave-one-out bias correction method

The bias of the estimating function $Q_{SQLn}(\beta)$ can be corrected by an application of a generalized version of the Slivnyak-Mecke theorem (Mecke, 1967). More specifically, for a Poisson point
process $M$ with an intensity function $\lambda(\cdot)$ and any function $f(s, M)$, one has that

$$E\left\{\sum_{s \in M \cap W} f(s, M\setminus\{s\})\right\} = \int_W E\{f(s, M)\}\lambda(s)ds, \tag{16}$$

provided that (16) is well-defined. Motivated by equation (16), we modify the estimating function $Q_{SQLn}(\beta)$ as

$$Q_{SQLu}(\beta) = \sum_{s \in N \cap W} \hat\phi(s, M; \beta) - \sum_{s \in M \cap W} \hat\phi(s, M\setminus\{s\}; \beta)\frac{\psi(s)}{\alpha(s)}, \tag{17}$$

where $\hat\phi(s, M\setminus\{s\}; \beta)$ is as defined in (15) with $M$ replaced by $M\setminus\{s\}$. Using the Slivnyak-Mecke theorem (16) and the independence between $N$ and $M$,

$$E_{M,N}\{Q_{SQLu}(\beta)\} = \int_W \left[E_M\{\hat\phi(s, M; \beta)\}\lambda_N(s) - E_M\{\hat\phi(s, M; \beta)\}\lambda_0(s)\psi(s)\right]ds = 0_p,$$

which shows that $Q_{SQLu}(\beta)$ is an unbiased estimating function.

It appears that the computation of all weights $\hat\phi(s, M\setminus\{s\}; \beta)$, $s \in M$, requires inverting an $(|M| - 1) \times (|M| - 1)$ matrix as described in (14) repeatedly for $|M|$ times, which leads to a computational cost of $O(|M|^4)$ floating point operations. However, the following lemma shows that this can be avoided by providing a short-cut formula for the leave-one-out estimator $\hat\phi(s, M\setminus\{s\}; \beta)$.

Lemma 1. For any $s \in M$, there is a one-to-one relationship between the estimators $\hat\phi(s, M; \beta)$ and $\hat\phi(s, M\setminus\{s\}; \beta)$ as follows:

$$\hat\phi(s, M\setminus\{s\}; \beta) = \frac{\hat\phi(s, M; \beta)}{w_{ss}(s, M\setminus\{s\})}, \tag{18}$$

where $w_{ss}(s, M\setminus\{s\})$ is the diagonal entry of the matrix $\{I_{|M|} + R(M, M)\}^{-1}$ corresponding to location $s$.

The proof is given in the supplementary material. Lemma 1 shows that the leave-one-out estimator $\hat\phi(s, M\setminus\{s\}; \beta)$ at all control locations can be obtained by inverting an $|M| \times |M|$ matrix just once, as is needed in (14). The computational cost of evaluating $Q_{SQLu}(\beta)$ is thus of the order $O(|M|^3 + |N||M|)$, the same as that of $Q_{SQLn}(\beta)$.
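The steps in (14), (15), (17) and Lemma 1 can be sketched numerically as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the point locations and covariates are random toy data, $\alpha$ is constant, $\psi$ is log-linear, and $g - 1$ is taken to be a known Gaussian-shaped function of the Thomas type. The helper names (`pcf_minus_one`, `R_matrix`, `phi_loo_brute`) are hypothetical. The sketch compares the short-cut of Lemma 1, written as division by the diagonal of $\{I_{|M|} + R(M, M)\}^{-1}$, with the brute-force leave-one-out computation.

```python
import numpy as np

def pcf_minus_one(d, kappa=50.0, omega=0.04):
    """g(s, t) - 1 for a Thomas-type pair correlation function (assumed known here)."""
    return np.exp(-d**2 / (4.0 * omega**2)) / (4.0 * np.pi * omega**2 * kappa)

def pairwise_dist(a, b):
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

rng = np.random.default_rng(0)
alpha = 1.5                                   # constant control sampling intensity (toy value)
beta = np.array([0.1, 0.5])                   # parameter value at which Q is evaluated

ctrl = rng.random((60, 2))                    # control locations, stand-in for M
case = rng.random((40, 2))                    # case locations, stand-in for N
Z_ctrl = np.column_stack([np.ones(60), rng.normal(size=60)])
Z_case = np.column_stack([np.ones(40), rng.normal(size=40)])

def psi(Z):
    return np.exp(Z @ beta)

def gamma(Z):
    return alpha / (alpha + psi(Z))

def eta(Z):
    # eta = gamma * psi^(1) / psi, which equals gamma * Z for the log-linear psi
    return gamma(Z)[:, None] * Z

def R_matrix(pts_s, Z_s, pts_t, Z_t):
    # R(s, t) = gamma(s) * psi(t) * {g(s, t) - 1} / alpha(t)
    g1 = pcf_minus_one(pairwise_dist(pts_s, pts_t))
    return gamma(Z_s)[:, None] * psi(Z_t)[None, :] * g1 / alpha

# (14): solve the |M| x |M| system once.
R_mm = R_matrix(ctrl, Z_ctrl, ctrl, Z_ctrl)
A = np.eye(len(ctrl)) + R_mm
A_inv = np.linalg.inv(A)
eta_ctrl = eta(Z_ctrl)
phi_ctrl = A_inv @ eta_ctrl

# Lemma 1 short-cut: divide by the matching diagonal entry of the inverse.
phi_loo_fast = phi_ctrl / np.diag(A_inv)[:, None]

def phi_loo_brute(i):
    """Leave-one-out weight at s_i: solve (14) on M without s_i, then interpolate by (15)."""
    keep = np.arange(len(ctrl)) != i
    phi_minus = np.linalg.solve(A[np.ix_(keep, keep)], eta_ctrl[keep])
    return eta_ctrl[i] - R_mm[i, keep] @ phi_minus

phi_loo_slow = np.array([phi_loo_brute(i) for i in range(len(ctrl))])

# (15) at the case locations, then the unbiased estimating function (17).
phi_case = eta(Z_case) - R_matrix(case, Z_case, ctrl, Z_ctrl) @ phi_ctrl
Q_sqlu = phi_case.sum(axis=0) - (phi_loo_fast * (psi(Z_ctrl) / alpha)[:, None]).sum(axis=0)
```

The agreement between `phi_loo_fast` and `phi_loo_slow` follows from the block-inverse (Schur complement) identity for $\{I_{|M|} + R(M, M)\}^{-1}$; in practice one would pass `Q_sqlu` as a function of `beta` to a root finder.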
It remains to quantify the potential loss of Godambe information caused by replacing the optimal $\phi(s; \beta)$ by the estimates $\hat\phi(s, M; \beta)$, $s \in N$, or $\hat\phi(s, M\setminus\{s\}; \beta)$, $s \in M$. The Godambe information matrix for (17) is defined as

$$G(\beta) = S^T(\beta)V^{-1}(\beta)S(\beta), \tag{19}$$

where $S(\beta) = -E\{\partial Q_{SQLu}(\beta)/\partial\beta^T\} = \int_W \lambda_0(s)E\{\hat\phi(s, M; \beta)\}\{\partial\psi(s, \beta)/\partial\beta^T\}ds$ and $V(\beta) = \mathrm{Var}\{Q_{SQLu}(\beta)\}$. Regarding $\partial Q_{SQLu}(\beta)/\partial\beta^T$, the weight function $\hat\phi(s, M; \beta)$ also depends on the parameter $\beta$, but by (16) the terms in $\partial Q_{SQLu}(\beta)/\partial\beta^T$ involving $\partial\hat\phi(s, M; \beta)/\partial\beta^T$ and $\partial\hat\phi(s, M\setminus\{s\}; \beta)/\partial\beta^T$ cancel after taking the expectation. The derivation of $V(\beta)$ is more involved and will be addressed later in Section 5.1. Furthermore, we show in Section 5.3 that the difference between $G(\beta)$ and the optimal Godambe information matrix $G_\phi(\beta)$, as defined in (7) with $f(\cdot; \beta)$ replaced by $\phi(\cdot; \beta)$, is asymptotically negligible.

2.5 Practical implementation details

Although computing the estimating function (17) is quite straightforward due to Lemma 1, several issues need to be addressed in practice. First note that $R(s, t)$ in the integral equation (12) is not symmetric, which is inconvenient both for practical implementation and for theoretical investigations. So we first derive a symmetric counterpart of $R(s, t)$. Define the functions

$$m(s) = \alpha(s)\{\alpha(s) + \psi(s)\}^{-1/2}\psi(s)^{-1/2}, \quad \phi^*(s; \beta) = m^{-1}(s)\phi(s; \beta), \quad \eta^*(s) = m^{-1}(s)\eta(s). \tag{20}$$

Then, multiplying (12) by $m^{-1}(s)$, the integral equation can be written as

$$\phi^*(s; \beta) + \int_W \lambda_0(t)\alpha(t)R^*(s, t)\phi^*(t; \beta)dt = \eta^*(s), \tag{21}$$

where $R^*(s, t) = m^{-1}(s)R(s, t)m(t) = r(s)r(t)\{g(s, t) - 1\}$ and $r(s) = \psi(s)^{1/2}\{\alpha(s) + \psi(s)\}^{-1/2}$.
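The symmetrization in (20) and (21) can be verified numerically. The short sketch below uses arbitrary toy values for $\alpha(s)$, $\psi(s)$ and $\eta(s)$ and a hypothetical exponential form for $g - 1$, none of which come from the paper. It checks that rescaling $R$ by $m$, in the direction obtained by multiplying (12) through by $m^{-1}(s)$, gives the symmetric kernel $r(s)r(t)\{g(s, t) - 1\}$, and that solving the symmetric system by a Cholesky factorization and mapping back with $m(s)$ reproduces the solution of the original system.

```python
import numpy as np

rng = np.random.default_rng(2)
n_ctrl = 8
pts = rng.random((n_ctrl, 2))                       # toy control locations
alpha = rng.uniform(0.5, 2.0, size=n_ctrl)          # alpha(s_i), toy values
psi = rng.uniform(0.5, 2.0, size=n_ctrl)            # psi(s_i), toy values
eta = rng.normal(size=(n_ctrl, 2))                  # eta(s_i), toy values with p = 2

dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
g_minus_1 = np.exp(-dist / 0.3)                     # hypothetical positive definite g - 1

gamma = alpha / (alpha + psi)
R = gamma[:, None] * psi[None, :] * g_minus_1 / alpha[None, :]   # R(s_i, s_j) from (12)

m = alpha / np.sqrt((alpha + psi) * psi)            # m(s) from (20)
r = np.sqrt(psi / (alpha + psi))                    # r(s)

R_star = R * m[None, :] / m[:, None]                # m^{-1}(s) R(s, t) m(t)
R_star_closed_form = r[:, None] * r[None, :] * g_minus_1

# Solve the symmetric positive definite system via Cholesky, then map back.
A_star = np.eye(n_ctrl) + R_star_closed_form
L = np.linalg.cholesky(A_star)
eta_star = eta / m[:, None]
phi_star = np.linalg.solve(L.T, np.linalg.solve(L, eta_star))
phi_from_symmetric = m[:, None] * phi_star

phi_direct = np.linalg.solve(np.eye(n_ctrl) + R, eta)
```

Working with the symmetric matrix makes Cholesky-type solvers applicable, which is convenient when the matrix is later made sparse.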
Following the procedure from equations (14)-(15), we define

$$\begin{pmatrix} \hat\phi^*_k(s_1, M; \beta) \\ \hat\phi^*_k(s_2, M; \beta) \\ \vdots \\ \hat\phi^*_k(s_{|M|}, M; \beta) \end{pmatrix} = \{I_{|M|} + R^*(M, M)\}^{-1}\begin{pmatrix} \eta^*_k(s_1) \\ \eta^*_k(s_2) \\ \vdots \\ \eta^*_k(s_{|M|}) \end{pmatrix}, \quad k = 1, \ldots, p, \tag{22}$$

where $R^*(M, M)$ is an $|M| \times |M|$ matrix whose $ij$th entry is $R^*(s_i, s_j)$, $i, j = 1, \ldots, |M|$. Similarly, denote $\hat\phi^*(s_i, M; \beta) = (\hat\phi^*_1(s_i, M; \beta), \ldots, \hat\phi^*_p(s_i, M; \beta))^T$ for any $s_i \in M$. Then for an arbitrary location $s \in W$, we define the interpolated function

$$\hat\phi^*(s, M; \beta) = \eta^*(s) - \sum_{t \in M \cap W} R^*(s, t)\hat\phi^*(t, M; \beta), \quad \text{for any } s \in W. \tag{23}$$

By the above definition, we have the following relationship:

$$\hat\phi(s, M; \beta) = m(s)\hat\phi^*(s, M; \beta). \tag{24}$$

The second issue is the computation of the inverse matrix $\{I_{|M|} + R^*(M, M)\}^{-1}$ when $|M|$ is large. Fortunately, by the definition of $R^*(M, M)$ in (22), a significant portion of its entries may be very close to 0, depending on how fast the function $g(s, t)$ decays to 1 as $\|s - t\|$ increases. Assume that the pair correlation function $g(s, t)$ is isotropic and can be expressed in the form $g_0(\|s - t\|)$. Then we can create a tapered version $R^*_{taper}(M, M)$ such that its $ij$th entry is the same as that of $R^*(M, M)$ if $\|s_i - s_j\| \leq d_{taper}$ for some $d_{taper} > 0$, and 0 otherwise. Following Guan et al. (2015), the taper distance $d_{taper}$ is chosen such that $\{g_0(d_{taper}) - 1\}/\{g_0(0) - 1\} = \tau_0$ for some small threshold $\tau_0$. Then a sparse matrix Cholesky decomposition can be used to obtain $\{I_{|M|} + R^*_{taper}(M, M)\}^{-1}$ efficiently. In our simulation studies, we use $\tau_0 = 10^{-6}$ if $|M| > 4{,}000$ and otherwise we use the exact $R^*(M, M)$.

The random labeling theorem for Poisson processes provides an alternative to tapering for reducing the computational burden of inverting $I_{|M|} + R^*(M, M)$ when the cardinality of the control process $M$ is high. The theorem states that for $B \geq 1$, $M$ can be randomly split into independent and identically distributed Poisson processes $M_1, \ldots, M_B$. Assuming that $M$ has intensity function $\alpha(\cdot)\lambda_0(\cdot)$, each $M_b$ has intensity function $\alpha(\cdot)\lambda_0(\cdot)/B$. We may then apply the case-control approach to obtain an estimate $\hat\beta^{(b)}$ for each pair $(N, M_b)$. The cardinality of $M_b$ is roughly $1/B$ times that of $M$, which makes the matrix inversion for $M_b$ more feasible. Finally, $\beta$ is estimated by the average $\hat\beta = B^{-1}\sum_{b=1}^{B}\hat\beta^{(b)}$, whose theoretical properties are investigated in Corollary 1 in Section 5.1. Obviously, the choice of $B$ plays an important role for this divide-and-conquer strategy, which will be addressed in future work.

Another issue is that we assumed full knowledge of the pair correlation function $g(\cdot, \cdot)$ when finding the weight function $\hat\phi(s, M; \beta)$. In practice $g(\cdot, \cdot)$ needs to be estimated. It is common practice to assume that $g(\cdot, \cdot)$ belongs to a parametric family, $g(\cdot, \cdot; \theta)$, governed by a parameter vector $\theta$, and to estimate $\theta$ from the data. We first obtain an estimate $\hat\theta$ of $\theta$ using Guan et al. (2008) and then plug in $g(\cdot, \cdot; \hat\theta)$ for $g(\cdot, \cdot)$ in (22).

We construct confidence intervals for $\beta$ based on approximate normality of $\hat\beta$. The theoretical justification for this is Theorem 1 in Section 5.1, where consistency and asymptotic normality of $\hat\beta$ are stated under the condition that $\hat\theta$ is consistent with a sufficiently fast rate of convergence. In Section 5.1 we also provide estimates for the covariance matrix of $\hat\beta$. Our simulation studies in Section 3 support the validity of the confidence intervals for $\beta$.

2.6 Stochastic quasi-likelihood as Monte Carlo approximation

Letting the intensity of controls tend to infinity, it is easy to see that (5) converges to

$$\sum_{s \in N} f(s; \beta) - \int_W f(s; \beta)\lambda_0(s)\psi(s; \beta)ds.$$

Thus, (5) can be viewed as a Monte Carlo approximation of this limiting estimating function, using $M$ as a set of random quadrature points. Suppose that $\lambda_0(s)$ is known, which implies that $\lambda_N(s)$ is purely parametric with a known multiplicative offset.
Then we may simulate a synthetic control process $M$ of known intensity $\alpha(\cdot)\lambda_0(\cdot)$ and use (5) to approximate the limit of (5) under an infinite control intensity, as an alternative to the deterministic quadrature approximation used in Guan et al. (2015). The use of deterministic quadrature approximation to
the integral in the estimating function introduces bias that can be difficult to quantify (Baddeley et al., 2014). In contrast, unbiasedness can be maintained using our proposed stochastic approximation. The use of the random quadrature process $M$ introduces additional parameter estimation error. The error can be reduced by using a larger control intensity $\alpha(\cdot)$, but this will on the other hand increase the computing time due to the need to solve a larger matrix equation, cf. Section 2.5. An alternative is to simulate several independent synthetic control processes and apply the divide-and-conquer strategy in Section 2.5. The use of replicated synthetic control processes is exemplified later in the paper.

3 Simulation Studies

In this section, we conduct a simulation study to investigate the finite sample performance of the proposed method. Both the case and control processes were simulated over an $n \times n$ square window with $n = 1, 2$ using the R package spatstat (Baddeley and Turner, 2005). For each $n$, we set the baseline intensity $\lambda_{0,n}(s) = \exp\{\beta^M_{0,n} + Y(s) + Z(s)/4\}$, where $Y(s)$ and $Z(s)$ are two independent realizations of stationary and isotropic Gaussian random fields with mean 0 and $\beta^M_{0,n}$ is chosen such that $n^{-2}\int_{[0,n]^2}\lambda_{0,n}(s)ds = 1$; see Figure 1(a) for details. Then the case and control intensities were specified as

$$\lambda_{N,n}(s) = \lambda_{0,n}(s)\exp\{\beta^N_{0,n} + Z(s)\beta_1\} \quad \text{and} \quad \lambda_{M,n}(s) = \alpha(s)\lambda_{0,n}(s)\Pi(s), \tag{25}$$

where $\beta_1 = 1$ and the intercept $\beta^N_{0,n}$ was chosen so that on average $400n^2$ case events were simulated. The function $\Pi(s)$ was introduced here to allow various types of departure from the proportionality assumption between $\lambda_{N,n}(\cdot)$ and $\lambda_{M,n}(\cdot)$ and will be specified later. The control processes were simulated as inhomogeneous Poisson processes by choosing $\alpha(s)$ equal to a constant $\alpha$ for all $s \in W_n$, where $\alpha = 400, 500, \ldots, 1500,$ The case processes were simulated
as inhomogeneous Thomas processes (Waagepetersen, 2007) with a pair correlation function

$$g(s, t) = 1 + (4\pi\omega^2\kappa)^{-1}\exp\{-(4\omega^2)^{-1}\|s - t\|^2\}, \tag{26}$$

where $\kappa > 0$ and $\omega > 0$ are the intensity of the parent process and the dispersal parameter, respectively. We considered $\kappa = 50, 100$ and $\omega = 0.02, 0.04$ for different clustering scenarios.

3.1 Correct model specification with $\Pi(s) \equiv 1$

In this subsection, we first consider the scenario where the assumptions of (1) and (2) hold by setting $\Pi(s) \equiv 1$. For each simulated pair of case and control processes, $\theta = (\kappa, \omega)^T$ was first estimated using the approach given in Guan et al. (2008) and then the proposed procedure was applied to estimate $\beta^N_{0,n}$ and $\beta_1$ by plugging in the estimated $\hat\theta_n = (\hat\kappa_n, \hat\omega_n)^T$. Three estimation approaches were considered: Diggle and Rowlingson (1994)'s conditional likelihood estimate (CLE) and the two proposed stochastic quasi-likelihood estimation approaches based on the naive method (SQLn) given in Section 2.3 and the unbiased version (SQLu) given in (17), where the leave-one-out correction was applied to $\hat\phi(s, M\setminus\{s\})$ for $s \in M$. For the SQLu method, we also considered the situation when the parametric family of the pair correlation function was mis-specified. More specifically, instead of (26) we used a pair correlation function for a variance-gamma shot-noise Cox process (Jalilian et al., 2013), which has the incorrect exponential form $g(s, t) = 1 + a^{-1}\exp(-b^{-1}\|s - t\|)$, $a > 0$, $b > 0$.

Summary statistics based on 1,000 simulations are presented in Table 1 and Figure 1, where rmse represents the root mean square error of the parameter estimates and CP90 represents the coverage probabilities of the nominal 90% confidence intervals constructed using Theorem 1 by plugging in the estimated matrices given in (30). The first observation is that SQLn produced biased estimators for $\beta_1$, which confirms our discussion in Section 2.3. The large bias typically
led to a larger rmse than for CLE. Table 1 also suggests that this bias decreased as $\alpha$ increased. In contrast, the SQLu estimate of $\beta_1$ was close to unbiased.

Table 1: Biases and rmses of the different estimators for $\beta_1$. Columns: $(\kappa, \omega)$, $n$, $\alpha$; BIAS and rmse for CLE; BIAS and rmse for SQLn; BIAS, rmse and CP90 for SQLu-Est.Thomas; BIAS, rmse and CP90 for SQLu-Est.Exponential. Row groups: $(\kappa, \omega) = (50, 0.02)$, $(50, 0.04)$, $(100, 0.02)$, $(100, 0.04)$. (Numerical entries not preserved.)

In terms of rmse, the SQLu estimating function outperformed CLE in almost all cases and the improvement in rmse could be substantial. In accordance with the asymptotic results in Theorem 1, the rmses were approximately halved when $n$ was increased from 1 to 2. Figure 1(b)-(c) show that the rmses of the CLE did not necessarily decrease as $\alpha$ increased. In contrast, the rmses for SQLu decreased steadily as $\alpha$ increased. This indicates that CLE made less efficient use of the control processes than the SQLu method. In addition, Figure 1(f) illustrates that the averages of the empirical $|W_n|^{-1}\mathrm{tr}(G_{\hat\phi_n})$ also increased steadily as $\alpha$ increased. Both observations support our theoretical findings in Theorem 4.

Figure 1: (a) The baseline intensity function $\lambda_0(s)$; (b)-(c): empirical rmses of estimates of $\beta_1$ obtained using CLE and SQLu with the true $g(\cdot, \cdot)$, the estimated Thomas PCF and the estimated Exponential PCF. The abbreviation SQLu ASE is for the asymptotic standard errors given by Theorem 1; (d)-(e): the estimated Thomas PCFs and Exponential PCFs; (f): the averages of the empirical $|W_n|^{-1}\mathrm{tr}(G_{\hat\phi_n})$. Panel titles: (a) The baseline intensity $\lambda_{0,2}(s)$; (b) Estimation accuracies ($\kappa = 50$, $\omega = 0.04$, $n = 1$); (c) Estimation accuracies ($\kappa = 50$, $\omega = 0.04$, $n = 2$); (d) Estimated PCF (Thomas); (e) Estimated PCF (Exponential); (f) The Godambe information ($\kappa = 50$, $\omega = 0.04$).

Furthermore, Figure 1(b)-(c) and Table 1 show that even when the parametric family of $g(s, t; \theta)$ was mis-specified, the estimation accuracy as well as the coverage probabilities of the confidence intervals were hardly affected. This surprising observation can be explained by Figure 1(d)-(e), where we can see that the estimated exponential pair correlation functions, although mis-specified, were still able to capture the clustering pattern among the cases.

Finally, we investigated the impact of using the preliminary estimator $\hat\theta_n$ on the statistical properties of the estimates obtained using the SQLu method. To do so, we again estimated $\beta^N_{0,n}$ and $\beta_1$ using SQLu but now with the true pair correlation function $g(s, t)$ instead of plugging in the estimated pair correlation function. The results are summarized in Figure 1(b)-(c), where
we can see that the estimated $\hat\theta_n$ indeed caused additional variability in $\hat\beta_1$. As a result, the asymptotic standard error given in Theorem 1 slightly underestimated the standard error of $\hat\beta_1$. However, when n = 2, the asymptotic standard error matched the empirical standard error quite well, which resulted in valid coverage probabilities of the confidence intervals in almost all cases. This confirms our theoretical findings.

3.2 Misspecified models with Π(s) ≢ 1

To study the robustness of the SQLu method to model misspecification, we applied it to case-control point pattern data that depart from assumptions (1) and (2). More specifically, let X(s) be an isotropic Gaussian random field with mean 0, variance 1 and an exponential covariance function with range parameter 10. We consider three forms of Π(s):

Model I: $\Pi(s) = 1 + q \sin(2\pi y)$, for s = (x, y),
Model II: $\Pi(s) = \Phi\big(\rho_q Z(s) + \sqrt{1 - \rho_q^2}\, X^*(s)\big)$, (27)
Model III: $\Pi(s) = \exp\{q X(s) - q^2/2\}$,

where $X^*(s)$ denotes a single realization of X(s), q = 0.25, 0.5, 0.75, 1.0, $\rho_q = 0.8(q - 0.25)$ and Φ(·) is the standard normal cumulative distribution function. Following the same estimation procedures outlined in the previous subsection and pretending that Π(s) ≡ 1, summary statistics based on 1,000 simulation runs with κ = 50, ω = 0.04 and n = 2 are presented in Table 2.

Model I investigates the case where the sampling scheme α(s) is systematically misspecified, with the misspecification becoming more severe as q increases. In this case, we can see from Table 2 that both CLE and SQLu might produce a biased estimator for $\beta_1$ and that the biases increased as q grew. However, one noticeable feature is that SQLu consistently produced much smaller biases than CLE until q = 1. Furthermore, the coverage probabilities of the confidence intervals of the SQLu method appear to be reasonably good for small to moderate values of q. In this scenario, SQLu appeared to be more robust than the CLE method.
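The summary statistics used in Tables 1 and 2 (BIAS, rmse and CP90) can be computed from simulation replicates roughly as follows. This is a minimal sketch, assuming that each replicate yields a point estimate and a plug-in standard error and that the confidence intervals are symmetric normal-approximation intervals; the function name `summarize` and the synthetic inputs are illustrative and are not taken from the paper.

```python
import numpy as np
from statistics import NormalDist


def summarize(estimates, std_errors, true_value, level=0.90):
    """Return (bias, rmse, coverage) over simulation replicates.

    Coverage is the fraction of normal-approximation intervals
    estimate +/- z * std_error that contain the true value.
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    # Two-sided critical value, about 1.645 for a 90% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    bias = estimates.mean() - true_value
    rmse = np.sqrt(np.mean((estimates - true_value) ** 2))
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    coverage = np.mean((lower <= true_value) & (true_value <= upper))
    return bias, rmse, coverage


# Synthetic replicates (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
true_beta1 = 1.0
se = 0.2
est = true_beta1 + se * rng.standard_normal(1000)
bias, rmse, cp90 = summarize(est, np.full(1000, se), true_beta1)
print(f"bias={bias:.3f} rmse={rmse:.3f} CP90={cp90:.3f}")
```

With correctly calibrated standard errors, as in this synthetic example, the empirical coverage should be close to the nominal 90% level; systematic bias of the kind reported for Model I would pull it below that level.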
Model II mimics the situation where a covariate, namely $X^*(s)$, that only affects the
control process is left out. In a sense, this can also be viewed as a misspecification of the function ψ(s) in (1), which should be $\psi(s, \beta) = \exp\{\beta^N_{0,n} + Z(s)\beta_1\}\,\Pi^{-1}(s)$ as opposed to $\psi(s, \beta) = \exp\{\beta^N_{0,n} + Z(s)\beta_1\}$ given in (25). In this case, it appears that when $\rho_q = 0$ with q = 0.25, both the CLE and SQLu methods still produced unbiased estimators for $\beta_1$. As expected, the estimation biases became larger as $\rho_q$ increased. However, when $\rho_q \neq 0$, $\beta_1$ can no longer be interpreted as the elevated/reduced impact of Z(s) on the case process relative to the control process.

Model III deals with the scenario where, even conditional on $\lambda_0(s)$ and ψ(s), the control process is still not a Poisson process. With a new X(s) simulated for each simulation run, the control process M becomes a log-Gaussian Cox process with pair correlation function $g_M(s, t) = \exp\{q^2 \exp(-\|s - t\|/10)\}$. In this case, both the CLE and SQLu methods yielded unbiased estimators for $\beta_1$ for all values of q. However, as q increased, the variances of both estimators generally increased due to the additional aggregation introduced into the control process. Coverage probabilities of the resulting confidence intervals are slightly off the nominal level. Nevertheless, the SQLu estimator outperformed the CLE estimator in terms of rmse for every q in this case.

4 Data examples

4.1 Beijing restaurant locations

The first data example concerns locations of two types of restaurants in Beijing, China. The data were collected from 11 districts of Beijing through a search engine. The control process consisted of locations of traditional Chinese restaurants. Due to the limit on the number of restaurant locations that can be returned by the search engine, we extracted a random sample consisting of 6% of the Chinese restaurants, i.e. using a uniform sampling probability α(s) = 0.06. This resulted in 2,659 control locations. The case process consisted of locations of all 1,781 Western-style restaurants in Beijing.
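The uniform subsampling of the control locations can be illustrated with a short sketch. It assumes independent Bernoulli thinning, in which each location is retained with probability α = 0.06; the text does not say whether the 6% sample was drawn this way or as a fixed-size sample, and the function name `thin_uniform` and the synthetic coordinates below are hypothetical stand-ins rather than the restaurant data.

```python
import numpy as np


def thin_uniform(points, alpha, rng):
    """Keep each point independently with probability alpha."""
    points = np.asarray(points, dtype=float)
    keep = rng.random(len(points)) < alpha
    return points[keep]


# Synthetic stand-in for the full set of locations (not the real data).
rng = np.random.default_rng(1)
all_points = rng.random((10_000, 2))
controls = thin_uniform(all_points, alpha=0.06, rng=rng)
print(len(controls), "controls retained out of", len(all_points))
```

Under this kind of thinning the retained pattern has intensity α times that of the full pattern, which is the role the sampling probability α(s) plays in the control intensity.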
Figure 2(a) gives all restaurant locations, where it appears that the Western-style restaurants tended to be more concentrated than the Chinese restaurants. For model estimation, we converted the longitude/latitude locations into UTM coordinates (northing, easting) following Snyder (1987). We modeled the possible differences between the spatial patterns of Western and Chinese restaurants using $\psi(s; \beta) = \exp\{\beta_0 + \beta^T Z(s)\}$, where the covariate vector Z(s) consisted of two district-level covariates: the average annual income of a regular worker (in 10,000 RMB, "Income") and the logarithm of the total number of foreign tourists (in 10,000, "log-travel"). The intercept $\beta_0$ was introduced in ψ(s; β) to model the overall difference in the intensities between the Western and Chinese restaurants in Beijing. To model possible clustering in the Western restaurant locations that was not explained by the intensity function, we further introduced a parametric pair correlation function g(s, t) as defined in (26). The parameters κ and ω were estimated using the approach in Guan et al. (2008); the estimate of κ was 5.65. Figure 2(b) shows that the estimated parametric pair correlation function agrees well with the nonparametric one (Guan et al. 2008) and both indicate the presence of clustering.

Table 2: Biases and rmses of the different estimators for $\beta_1$ with κ = 50, ω = 0.04 and n = 2. For each of q = 0.25 ($\rho_q$ = 0), q = 0.5 ($\rho_q$ = 0.2), q = 0.75 ($\rho_q$ = 0.4) and q = 1.0 ($\rho_q$ = 0.6), the table reports BIAS and rmse for CLE and BIAS, rmse and CP90 for SQLu, by model (I, II, III) and α. (The numerical entries are not available in this transcription.)

Figure 2: (a) Locations of restaurants (Western and Chinese; longitude and latitude); (b) estimated parametric and nonparametric pair correlation functions g(r) against distance (10 km); (c) residuals from the reduced model (h = 0.5; locations with UTM coordinates, easting and northing in 10 km).

Table 3: Regression parameter estimates for the restaurant data. Entries are estimates with standard errors in parentheses. (The P-value rows are not available in this transcription.)

          Full model                                      Reduced model
Method    Intercept      Income          log-travel      Intercept      log-travel
CLE       -4.38 (0.51)   0.050 (0.079)   0.21 (0.094)    -4.05 (0.19)   0.25 (0.067)
SQLu      -4.30 (0.32)   0.028 (0.045)   0.19 (0.048)    -4.11 (0.12)   0.21 (0.037)

Finally, the estimated regression parameters are summarized in Table 3. The covariate Income is not significant, while the covariate log-travel is significant regardless of the estimation method used. The covariate Income impacts the distributions of both Chinese and Western-style restaurants and is therefore likely absorbed into the baseline intensity $\lambda_0(s)$. The positive parameter estimate for the covariate log-travel shows that the Western-style restaurants tended to be more concentrated (relative to the Chinese restaurants) in districts that attracted more foreign tourists. Comparing the two approaches, SQLu produced much smaller standard errors than CLE, which illustrates the potential advantage of the proposed method. To
assess the goodness of fit, we also computed standardized smoothed residuals (see Guan et al. 2008) on a grid over the banded region in Figure 2(c) (residuals are only calculated for the 602 grid points that have at least 5 restaurants within a 5 km radius). The residuals are all of moderate magnitude and do not contradict the proposed model. Note that the apparent correlation in the residual plot is partly due to the smoothing procedure and partly due to the correlation in the point pattern data, cf. the fitted pair correlation function in Figure 2(b).

4.2 Tropical rain forest data

The second data example concerns the spatial locations of three tropical forest tree species, Acalypha diversifolia (528 trees), Lonchocarpus heptaphyllus (836 trees) and Capparis frondosa (3299 trees), in a 1000 m × 500 m rectangular window on Barro Colorado Island (Condit, 1998; Hubbell et al. 1999; Hubbell et al. 2005). Guan et al. (2015) conducted a detailed investigation of the point patterns of locations of these three species and their associations with environmental variables such as elevation (dem), slope gradient (grad), and soil contents of potassium (K), mineralized nitrogen (Nmin) and phosphorus (P). All three species display clustering patterns, which were modeled using parametric pair correlation functions. For each species, there is no apparent control process available to assist in modeling the underlying spatial intensity function. The purpose of this analysis is to show how the case-control methodology can be used, as described in Section 2.6, as a computationally efficient alternative to deterministic quadrature approximation when implementing quasi-likelihood for spatial point patterns (Guan et al. 2015). More specifically, we treated each species of interest as a case process separately and assumed that the case intensity function took a purely parametric form $\lambda_N(s; \beta) = \exp\{\beta_0 + \beta^T Z(s)\}$, as assumed in Guan et al.
(2015), where the covariate vector Z(s) consisted of environmental variables. Such a parametric assumption on $\lambda_N(s; \beta)$ leads to a special case of model (1) with $\lambda_0(s) = 1$ and $\psi(s; \beta) = \exp\{\beta_0 + \beta^T Z(s)\}$ and enabled us to simulate controls from a homogeneous Poisson process with a constant intensity α for the analysis of each
species. The proportional structure (1)-(2) was therefore maintained with such constructions of case and control point patterns. Furthermore, in this case, the regression parameter β should be interpreted as the elevated/reduced impact of Z(s) on the tree location intensities relative to complete spatial randomness, under which all tree locations follow a homogeneous Poisson process. For a fair comparison with Guan et al. (2015), we adopted both the selected covariates and the estimated pair correlation functions given there; see Guan et al. (2015) for more details. The controls were simulated using increasing intensities α such that the average numbers of simulated controls |W|α ranged from 500 upward. For each given intensity α, 1,000 independent realizations of the control process were simulated, and an averaged estimator of β as well as its standard error were obtained for both the CLE and the SQLu method following Corollary 1. The results are summarized in Table 4 and Figure 3, where QL stands for the quasi-likelihood approach proposed in Guan et al. (2015). We did not apply any tapering for either the SQLu method or the QL method.
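The control-simulation step can be sketched as follows. This is a rough illustration under stated assumptions: the window is the 1000 m × 500 m rectangle, controls come from a homogeneous Poisson process with intensity α, and estimates are averaged over independent control patterns. The names `simulate_controls`, `averaged_estimate` and `toy_estimator` are hypothetical; `toy_estimator` is only a placeholder and is neither the CLE nor the SQLu estimator, and the standard error formula of Corollary 1 is not reproduced here.

```python
import numpy as np


def simulate_controls(alpha, width, height, rng):
    """Homogeneous Poisson process with intensity alpha on [0,width]x[0,height]."""
    n = rng.poisson(alpha * width * height)   # random number of points
    x = rng.uniform(0.0, width, size=n)       # locations are i.i.d. uniform
    y = rng.uniform(0.0, height, size=n)
    return np.column_stack([x, y])


def averaged_estimate(estimator, cases, alpha, width, height, n_rep, rng):
    """Average estimator(cases, controls) over n_rep simulated control patterns."""
    fits = [
        estimator(cases, simulate_controls(alpha, width, height, rng))
        for _ in range(n_rep)
    ]
    return np.mean(fits, axis=0)


def toy_estimator(cases, controls):
    """Placeholder only: log ratio of case and control counts."""
    return np.log(len(cases) / max(len(controls), 1))


rng = np.random.default_rng(2)
width, height = 1000.0, 500.0
alpha = 500.0 / (width * height)     # about 500 controls on average
# Synthetic case locations standing in for one species (not the real data).
cases = np.column_stack([rng.uniform(0, width, 528), rng.uniform(0, height, 528)])
print(averaged_estimate(toy_estimator, cases, alpha, width, height, 100, rng))
```

Because the control patterns are simulated independently, the loop over replicates can be distributed across workers without any communication between them.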
Table 4: Estimates and standard errors for the tropical forest data. Entries are estimates with standard errors in parentheses; "--" marks values that are not available in this transcription.

                        Acalypha        Lonchocarpus                   Capparis
Method        |W_n|α    K               Nmin           P               dem           grad           K
CLE           --        -- (1.22)       -2.79 (0.71)   -0.16 (0.057)   2.85 (0.83)   -0.88 (1.05)   4.11 (0.99)
              --        -- (1.22)       -2.78 (0.71)   -0.16 (0.057)   2.84 (0.83)   -0.98 (1.05)   4.15 (0.99)
              --        -- (1.23)       -2.74 (0.72)   -0.16 (0.057)   2.86 (0.83)   -0.97 (1.06)   4.19 (0.99)
              --        -- (1.24)       -2.75 (0.73)   -0.16 (0.058)   2.86 (0.84)   -1.03 (1.07)   4.19 (0.99)
SQLu          --        -- (1.22)       -2.77 (0.70)   -0.15 (0.056)   2.74 (0.81)   -1.05 (1.00)   4.03 (0.96)
              --        -- (1.22)       -2.74 (0.70)   -0.15 (0.056)   2.67 (0.80)   -1.27 (0.98)   4.05 (0.95)
              --        -- (1.23)       -2.72 (0.70)   -0.14 (0.056)   2.57 (0.80)   -1.43 (0.96)   4.05 (0.94)
              --        -- (1.23)       -2.72 (0.70)   -0.14 (0.055)   2.45 (0.79)   -1.70 (0.95)   4.01 (0.94)
QL (100×50)   N/A       4.39 (1.22)     -2.77 (0.70)   -0.15 (0.055)   2.29 (0.79)   -1.88 (0.94)   4.04 (0.94)

Table 4 shows that the estimates for both Acalypha and Lonchocarpus are very similar for all approaches. This is because the pair correlation functions drop quickly, see Figure 3(e). On the other hand, for Capparis, where the pair correlation function decays much more slowly, see Figure 3(f), SQLu and QL produced very different estimates from those obtained with CLE. One noticeable feature is that as α increased, the estimated coefficient of grad as well as the
associated standard error decreased for the SQLu method. To give a better idea of the efficiency of each method, Figure 3 shows the efficiency of CLE/SQLu relative to QL (using grid points) as a function of $|W_n|\alpha$. The CLE method is almost always less efficient than SQLu and QL. In contrast, for the SQLu approach, the standard errors quickly reached the same level as those of the approximately optimal QL method as α increased. The SQLu method may be more computationally scalable because (a) far fewer control locations were needed in all three examples to reach standard errors similar to those of the QL method, which relied on 5,000 quadrature points in all cases, and (b) the computation of the averaged SQLu estimate can be easily parallelized.

Figure 3: Top panels, (a) Acalypha, (b) Lonchocarpus, (c) Capparis: relative efficiency, defined as the standard error of CLE or SQLu divided by the standard error of the QL estimators, as a function of $|W_n|\alpha$; bottom panels, (d) Acalypha, (e) Lonchocarpus, (f) Capparis: estimated pair correlation functions g(r) against distance (km).

5 Asymptotic properties

In this section, we first study the asymptotic properties of the estimator $\hat\beta$ obtained using the estimating function (17). Then we show that under certain conditions, this estimator is asymp-
More informationIntroduction. Spatial Processes & Spatial Patterns
Introduction Spatial data: set of geo-referenced attribute measurements: each measurement is associated with a location (point) or an entity (area/region/object) in geographical (or other) space; the domain
More informationDecomposition of variance for spatial Cox processes Jalilian, Abdollah; Guan, Yongtao; Waagepetersen, Rasmus Plenge
Aalborg Universitet Decomposition of variance for spatial Cox processes Jalilian, Abdollah; Guan, Yongtao; Waagepetersen, Rasmus Plenge Publication date: 2011 Document Version Early version, also known
More informationExtreme Value Analysis and Spatial Extremes
Extreme Value Analysis and Department of Statistics Purdue University 11/07/2013 Outline Motivation 1 Motivation 2 Extreme Value Theorem and 3 Bayesian Hierarchical Models Copula Models Max-stable Models
More informationOn block bootstrapping areal data Introduction
On block bootstrapping areal data Nicholas Nagle Department of Geography University of Colorado UCB 260 Boulder, CO 80309-0260 Telephone: 303-492-4794 Email: nicholas.nagle@colorado.edu Introduction Inference
More informationHierarchical Nearest-Neighbor Gaussian Process Models for Large Geo-statistical Datasets
Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geo-statistical Datasets Abhirup Datta 1 Sudipto Banerjee 1 Andrew O. Finley 2 Alan E. Gelfand 3 1 University of Minnesota, Minneapolis,
More informationPENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA
PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University
More information6.435, System Identification
System Identification 6.435 SET 3 Nonparametric Identification Munther A. Dahleh 1 Nonparametric Methods for System ID Time domain methods Impulse response Step response Correlation analysis / time Frequency
More informationChapter 2 Inference on Mean Residual Life-Overview
Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate
More informationForecasting Levels of log Variables in Vector Autoregressions
September 24, 200 Forecasting Levels of log Variables in Vector Autoregressions Gunnar Bårdsen Department of Economics, Dragvoll, NTNU, N-749 Trondheim, NORWAY email: gunnar.bardsen@svt.ntnu.no Helmut
More informationAn adapted intensity estimator for linear networks with an application to modelling anti-social behaviour in an urban environment
An adapted intensity estimator for linear networks with an application to modelling anti-social behaviour in an urban environment M. M. Moradi 1,2,, F. J. Rodríguez-Cortés 2 and J. Mateu 2 1 Institute
More informationA note on profile likelihood for exponential tilt mixture models
Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationAn application of the GAM-PCA-VAR model to respiratory disease and air pollution data
An application of the GAM-PCA-VAR model to respiratory disease and air pollution data Márton Ispány 1 Faculty of Informatics, University of Debrecen Hungary Joint work with Juliana Bottoni de Souza, Valdério
More informationWrapped Gaussian processes: a short review and some new results
Wrapped Gaussian processes: a short review and some new results Giovanna Jona Lasinio 1, Gianluca Mastrantonio 2 and Alan Gelfand 3 1-Università Sapienza di Roma 2- Università RomaTRE 3- Duke University
More informationIssues on quantile autoregression
Issues on quantile autoregression Jianqing Fan and Yingying Fan We congratulate Koenker and Xiao on their interesting and important contribution to the quantile autoregression (QAR). The paper provides
More informationKneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach"
Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach" Sonderforschungsbereich 386, Paper 43 (25) Online unter: http://epub.ub.uni-muenchen.de/
More informationDefault Priors and Effcient Posterior Computation in Bayesian
Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature
More informationSummary statistics for inhomogeneous spatio-temporal marked point patterns
Summary statistics for inhomogeneous spatio-temporal marked point patterns Marie-Colette van Lieshout CWI Amsterdam The Netherlands Joint work with Ottmar Cronie Summary statistics for inhomogeneous spatio-temporal
More informationBayesian Analysis of Latent Variable Models using Mplus
Bayesian Analysis of Latent Variable Models using Mplus Tihomir Asparouhov and Bengt Muthén Version 2 June 29, 2010 1 1 Introduction In this paper we describe some of the modeling possibilities that are
More informationInference For High Dimensional M-estimates: Fixed Design Results
Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationComparing Non-informative Priors for Estimation and Prediction in Spatial Models
Environmentrics 00, 1 12 DOI: 10.1002/env.XXXX Comparing Non-informative Priors for Estimation and Prediction in Spatial Models Regina Wu a and Cari G. Kaufman a Summary: Fitting a Bayesian model to spatial
More informationSimulating Uniform- and Triangular- Based Double Power Method Distributions
Journal of Statistical and Econometric Methods, vol.6, no.1, 2017, 1-44 ISSN: 1792-6602 (print), 1792-6939 (online) Scienpress Ltd, 2017 Simulating Uniform- and Triangular- Based Double Power Method Distributions
More informationSpatial Misalignment
Spatial Misalignment Jamie Monogan University of Georgia Spring 2013 Jamie Monogan (UGA) Spatial Misalignment Spring 2013 1 / 28 Objectives By the end of today s meeting, participants should be able to:
More informationImproving the travel time prediction by using the real-time floating car data
Improving the travel time prediction by using the real-time floating car data Krzysztof Dembczyński Przemys law Gawe l Andrzej Jaszkiewicz Wojciech Kot lowski Adam Szarecki Institute of Computing Science,
More informationFusing point and areal level space-time data. data with application to wet deposition
Fusing point and areal level space-time data with application to wet deposition Alan Gelfand Duke University Joint work with Sujit Sahu and David Holland Chemical Deposition Combustion of fossil fuel produces
More informationMeasuring Social Influence Without Bias
Measuring Social Influence Without Bias Annie Franco Bobbie NJ Macdonald December 9, 2015 The Problem CS224W: Final Paper How well can statistical models disentangle the effects of social influence from
More informationStatistical inference on Lévy processes
Alberto Coca Cabrero University of Cambridge - CCA Supervisors: Dr. Richard Nickl and Professor L.C.G.Rogers Funded by Fundación Mutua Madrileña and EPSRC MASDOC/CCA student workshop 2013 26th March Outline
More informationImpact of serial correlation structures on random effect misspecification with the linear mixed model.
Impact of serial correlation structures on random effect misspecification with the linear mixed model. Brandon LeBeau University of Iowa file:///c:/users/bleb/onedrive%20 %20University%20of%20Iowa%201/JournalArticlesInProgress/Diss/Study2/Pres/pres.html#(2)
More informationSingle Index Quantile Regression for Heteroscedastic Data
Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University SMAC, November 6, 2015 E. Christou, M. G. Akritas (PSU) SIQR
More informationHierarchical Modelling for Univariate Spatial Data
Spatial omain Hierarchical Modelling for Univariate Spatial ata Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A.
More informationOn the errors introduced by the naive Bayes independence assumption
On the errors introduced by the naive Bayes independence assumption Author Matthijs de Wachter 3671100 Utrecht University Master Thesis Artificial Intelligence Supervisor Dr. Silja Renooij Department of
More informationConservative variance estimation for sampling designs with zero pairwise inclusion probabilities
Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Peter M. Aronow and Cyrus Samii Forthcoming at Survey Methodology Abstract We consider conservative variance
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 12: Frequentist properties of estimators (v4) Ramesh Johari ramesh.johari@stanford.edu 1 / 39 Frequentist inference 2 / 39 Thinking like a frequentist Suppose that for some
More informationA Non-parametric bootstrap for multilevel models
A Non-parametric bootstrap for multilevel models By James Carpenter London School of Hygiene and ropical Medicine Harvey Goldstein and Jon asbash Institute of Education 1. Introduction Bootstrapping is
More information