Fitting Large-Scale Spatial Models with Applications to Microarray Data Analysis
Stephan R. Sain
Department of Mathematics
University of Colorado at Denver
Denver, Colorado
ssain@math.cudenver.edu

Reinhard Furrer
Geophysical Statistics Project
National Center for Atmospheric Research
Boulder, Colorado
furrer@ucar.edu

Many problems in the environmental and biological sciences involve the analysis of large quantities of data. Further, the data in these problems are often subject to various types of structure and, in particular, spatial dependence. Traditional model fitting often fails due to the size of the datasets, since it is difficult not only to specify but also to compute with the full covariance matrix. For example, a single microarray can include over 400,000 individual observations. We propose using a very general type of mixed model that has a random spatial component. Recognizing that spatial covariance matrices often exhibit a large number of zero or near-zero entries, covariance tapering is used to force near-zero entries to zero. Then, taking advantage of the sparse nature of such tapered covariance matrices, backfitting is used to estimate the fixed and random model parameters. Results will be demonstrated on an experiment using microarrays to build a profile of differentially expressed genes relating to cerebral vascular malformations, an important cause of hemorrhagic stroke and seizures.

Keywords: Mixed effects; Backfitting; Covariance tapering; Sparse matrices

1 Introduction

Many spatial problems are inherently multivariate, with more than one measurement or observation at each spatial location. Moreover, many spatial problems involve a large number of spatial locations. This leads to serious computational difficulties in constructing, storing, and manipulating very large regression and covariance matrices. Such problems arise in a number of areas, from traditional environmental statistics and epidemiology to new approaches to biological problems. For example, the
authors' present research in this area includes combining observed climate data and climate models to examine climate model behavior as well as predictions of climate change. In this setting, there are two variables, precipitation and temperature, for sixteen different models on a 5 grid resulting in observations per climate model.

The focus of this paper is a problem of considerable current interest, namely analyzing microarray data from biological experiments. In this case, we are attempting to build a profile of differentially expressed genes relating to cerebral vascular malformation (Shenkar et al., 2003). In this study, there are roughly twenty gene chips with three disease groups (control and two disease states), with each chip being essentially an array of approximately 400,000 observations.

We propose a simple, multivariate, additive (mixed-effects) spatial model and discuss some strategies for fitting such models and estimating model parameters
when the size of the data structures is large. There are two key aspects. First, recognizing that many if not most of the elements of the spatial covariance matrices are zero or near-zero, covariance tapering is used to force near-zero entries to zero, which introduces a great deal of sparseness in the covariance matrices. This sparseness allows such matrices to be stored and manipulated more efficiently. Second, the additive structure in the model is exploited using a backfitting algorithm for parameter estimation.

The next section develops the model in detail, while Section 3 discusses several computational issues that arise when using huge datasets with backfitting algorithms. Section 4 shows qualitative and quantitative results of a small example using microarray data analysis. Finally, Section 5 discusses current and future research of this long-term project.

2 A Multivariate, Additive Spatial Model

A simple, multivariate, additive (mixed-effects) spatial model for an observation vector $Y$ can be written as

$$Y = X\beta + h + \varepsilon, \qquad (1)$$

where $X\beta$ represents fixed effects; $h$ represents a random, zero-mean spatial process with $\mathrm{Var}(h) = \Sigma_h$; and $\varepsilon$ represents a random, zero-mean error process with $\mathrm{Var}(\varepsilon) = \Sigma_\varepsilon$, orthogonal to (i.e. uncorrelated with) $h$. Model (1) is generic; in the case of gene expression on a chip, the fixed effects can be expanded to

$$Y = \beta_{\text{mean}} + R\beta_{\text{row}} + C\beta_{\text{col}} + G\beta_{\text{Gene}} + h + \varepsilon. \qquad (2)$$

From a biological point of view, $\beta_{\text{Gene}}$ is the quantity of interest, whereas the chip-specific effects $\beta_{\text{Chip}} = [\beta_{\text{mean}}, \beta_{\text{row}}^T, \beta_{\text{col}}^T]^T$ are ancillary and are included to account for any chip-specific, large-scale trends observed in the data. It is convenient to parameterize the spatial covariance matrix by $\theta$ ($\Sigma_h = K(\theta)$) and to assume a white noise measurement error ($\Sigma_\varepsilon = \sigma^2 I$).

Suppose we have $k$ different chips having an identical gene layout. Then, model (1) is expanded to

$$\begin{pmatrix} Y_1 \\ \vdots \\ Y_k \end{pmatrix} =
\begin{pmatrix} F & & 0 & G \\ & \ddots & & \vdots \\ 0 & & F & G \end{pmatrix}
\begin{pmatrix} \beta_{\text{Chip}\,1} \\ \vdots \\ \beta_{\text{Chip}\,k} \\ \beta_{\text{Gene}} \end{pmatrix}
+ \begin{pmatrix} h_{\text{Chip}\,1} \\ \vdots \\ h_{\text{Chip}\,k} \end{pmatrix}
+ \begin{pmatrix} \varepsilon_{\text{Chip}\,1} \\ \vdots \\ \varepsilon_{\text{Chip}\,k} \end{pmatrix} \qquad (3)$$

with $F = [\mathbf{1}, R, C]$ and where
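the spatial processes $h_{\text{Chip}\,i}$ are mutually independent. The covariance matrices of the spatial and the error processes are assumed to take the forms

$$\Sigma_h = \begin{pmatrix} K_1 & 0 & 0 \\ 0 & K_2 & 0 \\ 0 & 0 & K_k \end{pmatrix}
\quad\text{and}\quad
\Sigma_\varepsilon = \begin{pmatrix} \sigma_1^2 I & 0 & 0 \\ 0 & \sigma_2^2 I & 0 \\ 0 & 0 & \sigma_k^2 I \end{pmatrix}, \qquad (4)$$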
where $K_i = K(\theta_i)$ represents a chip-specific spatial covariance matrix parameterized by $\theta_i$, and $\sigma_i^2$ are chip-specific variances, called the nugget effect in the geostatistical literature. The independence assumption across chips is justified by the fact that, typically, the chips are based on unique tissue samples, often from different individuals. The chip-specific fixed ($\beta_{\text{Chip}\,i}$) and random, spatial ($h_{\text{Chip}\,i}$) effects are included to account for chip-specific large-scale and small-scale spatial trends. This type of structure is able to model non-linear relationships in much the same fashion as smoothing splines (Nychka, 2000). Note that this model can also be written in an additive fashion, separating chip-specific and gene-specific effects.

For small samples, one could use ML or REML (Kitanidis, 1997; Stein, 1999) to fit the covariance parameters; estimates of $\beta$ and predictions of the random effects then follow directly:

$$\hat\beta = (X^T V^{-1} X)^{-1} X^T V^{-1} Y \quad \text{(generalized least-squares)}, \qquad (5)$$
$$\hat h = \Sigma_h V^{-1} (Y - X\hat\beta),$$

where $V = \Sigma_h + \Sigma_\varepsilon$. These estimates are equivalent to the universal kriging solutions, i.e. the best linear unbiased predictor (e.g. Cressie, 1993). In our setting, direct computation with the design and covariance matrices is impossible because the observation vector $Y$ is too big, even with only one or two chips. We solve this problem with the backfitting algorithms outlined below.

2.1 Backfitting with One Chip

Backfitting procedures are widely used in additive or generalized linear/additive models, e.g. Breiman and Friedman (1985); Buja et al. (1989). Applied to equation (1), the backfitting algorithm consists of iteratively estimating the fixed effects $\beta$ (regression step) and the spatial process $h$ (kriging step), as schematized below.

[1] Let $\hat h^{(0)}$ be an initial guess and put $j = 1$.
[2] $\hat\beta^{(j)} = (X^T X)^{-1} X^T \big(Y - \hat h^{(j-1)}\big)$.
[3] Estimate covariance parameters to get $\hat\theta^{(j)}$ and $\hat\sigma^{2(j)}$, then put $\hat h^{(j)} = \Sigma_h V^{-1} \big(Y - X\hat\beta^{(j)}\big)$.
[4] Put $j = j + 1$ and repeat [2] and [3] until
convergence.

To prove equivalence after convergence, plug [3] into [2]; a few straightforward manipulations then lead to the generalized least-squares estimator (5). The convergence criterion in step [4] should be based on the estimates $\hat\beta^{(j)}$, $\hat\theta^{(j)}$ and $\hat\sigma^{2(j)}$, for example, absolute or relative mean squared differences. The algorithm usually converges in a few steps. We will come back to this issue in Section 4.

In the setting of a single microarray chip, the design matrix $X$ is too big to compute with using the available computing resources. We therefore use equation (2) to separate the different fixed effects and perform the regression step [2] iteratively on the chip-specific effects $\beta_{\text{Chip}}$ and the gene effects $\beta_{\text{Gene}}$. Thus, step [2] becomes:
[2a] Let $\hat\beta_{\text{Gene}}^{(0)}$ be an initial guess and put $l = 1$.
[2b] $\hat\beta_{\text{Chip}}^{(l)} = (F^T F)^{-1} F^T \big(Y - \hat h^{(j-1)} - G\hat\beta_{\text{Gene}}^{(l-1)}\big)$.
[2c] $\hat\beta_{\text{Gene}}^{(l)} = (G^T G)^{-1} G^T \big(Y - \hat h^{(j-1)} - F\hat\beta_{\text{Chip}}^{(l)}\big)$.
[2d] Put $l = l + 1$ and repeat [2b] and [2c] until convergence, then $\hat\beta^{(j)} = \big(\hat\beta_{\text{Chip}}^{(l-1)\,T}, \hat\beta_{\text{Gene}}^{(l-1)\,T}\big)^T$.

2.2 Backfitting with Several Chips

Suppose we have $k$ different chips. Following equation (3), we extend the backfitting algorithm presented in the last section. As there is no spatial structure between different chips (cf. equation (4)), we estimate and fit the spatial structure on each chip separately. In a similar way, the chip-specific effects depend only on the observations of the corresponding chip. Only the gene effects have to be considered across all observations. It can be shown that they can be fitted by taking the mean of the centered observations

$$Z_i = Y_{\text{Chip}\,i} - \hat h_{\text{Chip}\,i}^{(j-1)} - F\hat\beta_{\text{Chip}\,i}^{(l)}.$$

Therefore, the design matrices are identical to the case of a single chip. This yields the modified backfitting algorithm below.

[1'] Let $\hat h_{\text{Chip}\,i}^{(0)}$, $i = 1, \dots, k$, be an initial guess and put $j = 1$.
[2a'] Let $\hat\beta_{\text{Gene}}^{(0)}$ be an initial guess and put $l = 1$.
[2b'] For $i = 1, \dots, k$, $\hat\beta_{\text{Chip}\,i}^{(l)} = (F^T F)^{-1} F^T \big(Y_{\text{Chip}\,i} - \hat h_{\text{Chip}\,i}^{(j-1)} - G\hat\beta_{\text{Gene}}^{(l-1)}\big)$.
[2c'] Let $Z = \frac{1}{k}\sum_{i=1}^{k} \big(Y_{\text{Chip}\,i} - \hat h_{\text{Chip}\,i}^{(j-1)} - F\hat\beta_{\text{Chip}\,i}^{(l)}\big)$, then put $\hat\beta_{\text{Gene}}^{(l)} = (G^T G)^{-1} G^T Z$.
[2d'] Put $l = l + 1$ and repeat [2b'] and [2c'] until convergence, then $\hat\beta^{(j)} = \big(\hat\beta_{\text{Chip}}^{(l-1)\,T}, \hat\beta_{\text{Gene}}^{(l-1)\,T}\big)^T$.
[3'] For $i = 1, \dots, k$, estimate covariance parameters to get $\hat\theta_i^{(j)}$ and $\hat\sigma_i^{2(j)}$, then put $\hat h_{\text{Chip}\,i}^{(j)} = K_i(\hat\theta_i^{(j)}) \big(K_i(\hat\theta_i^{(j)}) + \hat\sigma_i^{2(j)} I\big)^{-1} \big(Y_{\text{Chip}\,i} - F\hat\beta_{\text{Chip}\,i}^{(j)} - G\hat\beta_{\text{Gene}}^{(j)}\big)$.
[4'] Put $j = j + 1$ and repeat [2a'] to [3'] until convergence.

This backfitting algorithm uses essentially the same amount of storage as the algorithm for a single chip. Note that the computing time of step [2c'] is comparable with that of step [2c], whereas steps [2b'] and [3'] are $k$ times as expensive as the respective steps in the
single chip case.
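The single-chip iteration of Section 2.1 can be sketched on a toy problem. The following Python is a minimal illustration only, not the authors' implementation: it assumes the covariance parameters are known and held fixed, so that step [3] reduces to the kriging update, and it uses dense pure-Python linear algebra that would be unusable at microarray scale. The helper names (`solve`, `normal_eq`, `backfit`, `gls`) and the six-point data set are invented for the example.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def normal_eq(X, W, y):
    """Solve (X^T W X) beta = X^T W y, with W given as a function v -> W v."""
    cols = [list(c) for c in zip(*X)]
    Wcols = [W(c) for c in cols]
    Wy = W(y)
    A = [[sum(a * b for a, b in zip(ci, wj)) for wj in Wcols] for ci in cols]
    rhs = [sum(a * b for a, b in zip(ci, Wy)) for ci in cols]
    return solve(A, rhs)

def add_nugget(Sigma_h, sigma2):
    n = len(Sigma_h)
    return [[Sigma_h[i][j] + (sigma2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def backfit(X, Y, Sigma_h, sigma2, tol=1e-12, max_iter=5000):
    """Alternate regression step [2] and kriging step [3] (fixed covariance)."""
    V = add_nugget(Sigma_h, sigma2)
    h = [0.0] * len(Y)                      # step [1]: initial guess for h
    beta = None
    for _ in range(max_iter):
        # step [2]: ordinary least squares of (Y - h) on X
        new_beta = normal_eq(X, lambda v: v, [y - hi for y, hi in zip(Y, h)])
        # step [3]: kriging of the regression residuals, h = Sigma_h V^{-1} r
        resid = [y - f for y, f in zip(Y, matvec(X, new_beta))]
        h = matvec(Sigma_h, solve(V, resid))
        # step [4]: stop once beta no longer changes
        converged = beta is not None and max(
            abs(a - b) for a, b in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta, h

def gls(X, Y, Sigma_h, sigma2):
    """Generalized least-squares estimator, equation (5)."""
    V = add_nugget(Sigma_h, sigma2)
    return normal_eq(X, lambda v: solve(V, v), Y)

# Hypothetical data: six sites on a line, intercept plus linear trend.
sites = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [[1.0, s] for s in sites]
Y = [1.2, 0.7, 1.9, 2.4, 2.1, 3.3]
Sigma_h = [[0.5 * math.exp(-abs(a - b) / 1.5) for b in sites] for a in sites]

beta_bf, h_bf = backfit(X, Y, Sigma_h, sigma2=0.5)
beta_gls = gls(X, Y, Sigma_h, sigma2=0.5)
print(beta_bf, beta_gls)
```

With the covariance parameters fixed, the backfitted coefficients should agree with the generalized least-squares solution (5) up to the stopping tolerance, which is the equivalence argued in Section 2.1. In this toy setting the iteration may need more passes than the few reported for the real data.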
3 Computational Issues

One of the aims of this study was to see whether this kind of analysis could be done with existing software on a reasonably sized desktop computer. We decided to use the freely available software R (Ihaka and Gentleman, 1996; R Development Core Team, 2004) on a RedHat Linux system with 2 Gbytes of RAM.

3.1 Sparse Matrices

The design matrices $F$ and $G$ contain as entries $\pm 1$ and a vast number of zeros. If such huge matrices contain only a small percentage of nonzero elements, it is advantageous to use more complex storage methods than a simple doubly indexed array. One commonly used structure consists of three vectors, where the first contains the nonzero elements, the second the column indices of the elements stored in the first, and the last the pointers to the beginning of each matrix row in the first two vectors. For an $n \times n$ matrix with $z$ nonzero elements we thus need $z$ reals and $z + n + 1$ integers compared to $n \times n$ reals (e.g. George and Liu, 1981; see also Table 1 as explained in Section 3.3).

The R package SparseM (Koenker and Ng, 2003) contains a few rudimentary functions for handling sparse matrices. We used their concept of representing sparse matrices and wrote the backfitting procedure in a linear, sequential way, calling as few functions as possible in order to save memory. Computationally expensive blocks, such as the construction of the design matrices, are coded in Fortran 77. The coding is similar to the functions given in Furrer (2004).

3.2 Covariance Tapering

In the backfitting algorithm, steps [3] or [3'] are best linear unbiased predictions (BLUP) of a spatial field, also called simple kriging in the geostatistical literature. The BLUP essentially requires solving the huge linear system $Vx = Z$, where $Z$ contains centered observations. Computationally, we first perform a Cholesky factorization $L L^T = V$ and then successively solve the triangular systems $Lw = Z$ and $L^T x = w$, giving $x = V^{-1} Z$. Typical
covariance structures imply full matrices $V$. Tapering the covariance function with some positive definite, compactly supported function induces a sparseness structure in $V$ and preserves asymptotic optimality (Furrer et al., 2004). The taper range determines the degree of approximation but also the sparseness of $V$. As a rule of thumb, Furrer et al. (2004) recommend to use points within the taper range. In our setting, we cannot meet this proposal because of memory limitations. With taper distance $\sqrt{2} < \eta \le 2$ (i.e. 8 points) the Cholesky factor of $K_i$ contains a number of nonzero elements of the order $10^6$. However, with $2 < \eta \le 3$ (ie points) the Cholesky factor of $K_i$ contains a number of nonzero elements of the order $10^8$. We will therefore use a taper length of 2, leading to 3,614,762 (0.002%) and 67,070,820 (0.040%) nonzero elements in the covariance matrix and its Cholesky factor, respectively, for the single chip case.

We suppose that the spatial process is stationary and isotropic such that the $ij$th element of the covariance matrices is given by a positive definite function $k(h; \theta_1, \theta_2)$, where $h$ is the distance between observations $i$ and $j$. The parameter $\theta_1 = k(0)$ is called the sill and $\theta_2$ is the range parameter, responsible for the rate at which the covariance decays.
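To make the effect of tapering on storage concrete, the sketch below builds a tapered covariance matrix on a small regular grid and stores it in the three-vector row format described in Section 3.1. It is a hedged illustration under stated assumptions: the exponential covariance and spherical taper with range 2 match the choices reported later for the single chip, but the 6 x 6 grid, the parameter values 0.5 and 1.5, and the function names `tapered_cov` and `to_csr` are hypothetical, and the dense intermediate matrix is only acceptable at this toy size.

```python
import math

def tapered_cov(h, theta1, theta2, taper=2.0):
    """Exponential covariance times a spherical taper; zero for h >= taper."""
    if h >= taper:
        return 0.0
    u = h / taper
    return theta1 * math.exp(-h / theta2) * (1.0 - 1.5 * u + 0.5 * u ** 3)

def to_csr(dense):
    """Row-wise three-vector storage: values, column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

m = 6  # hypothetical 6 x 6 grid with unit spacing
points = [(r, c) for r in range(m) for c in range(m)]
K = [[tapered_cov(math.hypot(p[0] - q[0], p[1] - q[1]), 0.5, 1.5)
      for q in points] for p in points]

values, col_idx, row_ptr = to_csr(K)
n = len(points)
print(len(values), "nonzeros out of", n * n)  # 256 nonzeros out of 1296
```

With a taper range of 2, each interior grid point keeps only itself and its 8 nearest neighbours, so the storage is z reals plus z + n + 1 integers (here z = 256, n = 36) instead of n x n reals.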
3.3 Choice of Contrasts

The design matrices are sparse, but the choice of the contrasts determines to what extent. Therefore, this choice is crucial to our objective. As an illustration, Table 1 gives the percentages of nonzero elements for sum and treatment contrasts for one chip.

Table 1: Sparseness for different contrasts. Percentages of nonzero elements compared to a full matrix. Note that X has more than elements. For two cases only lower bounds can be given due to limited RAM.
Columns: $F$, $F^T F$, $G$, $G^T G$, $X$, $X^T X$. Rows: Treatment, Sum. Surviving entries: > > 6626

As we decouple the chip and gene effects in the regression step of the backfitting algorithm (steps [2b'] and [2c']), we can switch between different contrasts for each of those effects. For interpretability reasons, we choose sum contrasts for $F$ and treatment contrasts for $G$.

Covariance tapering and the additional iteration over the gene and chip effects (steps [2a, ..., d] and [2a', ..., d']) cut the computational and storage cost considerably. However, 2 Gbytes of RAM are not sufficient for our application to keep all matrices permanently in memory. Hence, for each regression and kriging step we construct the individual design and covariance matrices and eliminate them afterwards.

3.4 Standard Errors

To calculate standard errors of the parameter $\hat\beta$, we could simply use equation (5) to deduce

$$\mathrm{Var}(\hat\beta) = (X^T V^{-1} X)^{-1}\, \mathrm{Var}(Y).$$

However, this variance cannot be calculated directly, since the matrices are too big. In the case of one chip, one could simplify the expression to

$$\mathrm{Var}(\hat\beta_{\text{Gene}}) = \big((G^T G)^{-1} - G^T F (F^T F)^{-1} F^T G\big)\, \mathrm{Var}(Y).$$

With our existing computing resources, it is still not possible to evaluate this quantity. We therefore use the simplistic approximation

$$\mathrm{Var}(\hat\beta_{\text{Gene}}) \approx \big(\hat\theta_1^2 + \hat\sigma^2\big) (G^T G)^{-1}. \qquad (6)$$

4 Application

The raw data from the microarray chips were rounded to the nearest 1/4. We therefore blurred the data prior to our analysis by adding white noise drawn from a uniform$(-0.125, 0.125)$ distribution. Then we took the logarithm and subtracted the mean. Those
transformed observations were plugged into the backfitting algorithms previously outlined. The next two sections present the results for a single and a double chip model.
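The preprocessing just described (blurring values that were rounded to the nearest 1/4, taking logarithms, and centering) can be sketched in a few lines. This is an illustrative sketch under assumptions: the text does not state the base of the logarithm, so the natural logarithm is assumed here, and the function name `preprocess`, the seed, and the sample intensities are hypothetical.

```python
import math
import random

def preprocess(raw, rng, half_width=0.125):
    """Blur rounded intensities, take logs, and subtract the mean."""
    # Uniform noise on (-1/8, 1/8) undoes the rounding to the nearest 1/4.
    blurred = [v + rng.uniform(-half_width, half_width) for v in raw]
    # Natural log is an assumption; inputs >= 0.25 stay positive after blurring.
    logged = [math.log(v) for v in blurred]
    mean = sum(logged) / len(logged)
    return [v - mean for v in logged]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
raw = [0.25, 0.5, 1.0, 2.75, 4.0, 10.25]  # hypothetical rounded intensities
centered = preprocess(raw, rng)
print(centered)
```

The blurring half-width of 0.125 is half the rounding step, so every blurred value stays within the interval that would have rounded to the recorded value.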
Figure 1: Two-dimensional empirical covariance function for single chip example.

4.1 Single Chip Results

Figure 1 shows the two-dimensional empirical covariance function after fitting, strongly confirming the isotropy of the spatial process. To our knowledge there is no a priori reason that the spatial covariance has a particular structure. Given the gridded data, empirical covariance estimates are not able to refute a linear behavior of the covariance function at the origin (cf. Figure 2). We suppose an exponential covariance structure and use a spherical taper with a fixed taper range of 2 (the maximum value that is computationally feasible). The resulting covariance function for $K_1$ can be written as

$$k(h; \theta_1, \theta_2) = \theta_1 \exp\left(-\frac{h}{\theta_2}\right) \left(1 - \frac{3h}{4} + \frac{h^3}{16}\right)$$

for $0 \le h \le 2$, and zero beyond the taper range. Figure 2 shows the ordinary least squares fits with $\hat\theta_1 = 0.487$, $\hat\theta_2 = 1.528$ and nugget effect $\hat\sigma^2 = 0.061$. As presumed, the backfitting algorithm converges quickly, with $\mathrm{MSE}\big(\hat\beta^{(j)} - \hat\beta^{(j-1)}\big) < 10^{-4}$ for $j \ge 4$. Figure 3 shows how quickly a few randomly selected coefficients converge. Table 2 gives the required computing time on a Linux powered 2.6 GHz Xeon processor with 2 Gbytes of RAM.

Figure 2: Empirical and fitted covariances for single chip example. (Legend: empirical horizontal, empirical vertical, empirical off-axis, fitted exponential, tapered covariance (exponential times spherical taper). Axes: lag, covariance.)

Figure 6 displays the row and column effects (the color bar was taken over the range of the displayed data). The top panel reflects the fact that perfect match and mismatch are on alternating rows. The column effects might indicate a small trend with higher values at the right. But the row and column effects are small compared
to the spatial process (Figure 7, top). The observations are the sum of the chip-specific effects (Figure 7, bottom), the gene effects and the residuals (Figure 8, top and bottom). The residuals indicate that all the structure could be explained with the fixed effects and a spatial process, as there is no spatial structure or pattern left. Figure 4 shows the effects between perfect match and mismatch, which are slightly positively skewed. We tried to normalize the effects using the approximation (6). However, little difference is observed and there is a larger number of normalized effects beyond the typical threshold values of $\pm 2$.

Figure 3: Convergence of randomly selected fixed effects parameters (row and column effects, left panel; gene effects, right panel) for single chip example. (Horizontal axes: iteration.)

Table 2: Computing time for different steps in the backfitting algorithm in R. The values represent the mean of the iterations (Linux, 2.6 GHz Xeon processor with 2 Gbytes of RAM).

Action                                          Time (sec)
Read data, variable setup                             5.59
Create the matrix F^T F                              14.44
Solve F^T F x = b                                     1.61
Create the matrix G^T G                              17.76
Solve G^T G x = b                                     3.08
Total typical regression step (4 iterations)         45.28
Estimate covariance parameters                       12.19
Create the matrix Σ_h                                10.48
Solve Σ_h x = b                                      45.67
Total typical kriging step                           75.51
Total backfitting (4 iterations)                    506.23

4.2 Double Chip Results

The algorithm was applied to two chips based on different samples from a single individual. Figure 9 shows the differences between the chip effects and spatial processes in the case of two chips. Although the differences between the chip effects are small compared to the fitted effects, they exhibit an interesting pattern. Chip
2 has almost exclusively bigger effects. The spatial difference shows some rather large blotches, suggesting substantial differences in chip-specific large-scale trends. This emphasizes that large-scale and small-scale chip-specific effects and trends can be modeled and extracted. Figure 5 compares the fitted gene effects obtained from the analysis with a single chip only and from the analysis taking both chips into account. Reassuringly, there do not seem to be substantial differences in the gene effects across the two chips from the same individual.

Figure 4: Gene effects between perfect match and mismatch for single chip example (the right panel gives the normalized effects). The horizontal and vertical lines are the means. The red and blue curves are smoothed histograms. The dotted curves are superimposed normal densities. (Axes: perfect match, mismatch.)

Figure 5: Comparison of the fitted gene effects with the single chip and the double chip model. (Axes: effects with one chip, effects with two chips.)
5 Discussion and Outlook

Our goal in this project was essentially a proof of concept: to establish that traditional additive, mixed-effects models for multivariate spatial data can be used to analyze large-scale data problems such as those posed in the environmental and biological sciences. We now begin serious application of these methods, in particular to the microarray data from experiments associated with cerebral vascular malformations (among others). More specifically, we seek a more detailed analysis, including the examination of genes labeled as differentially expressed and the comparison with the results from more established methods.

Moreover, we seek to examine the differences in differentially expressed genes for the different disease groups in our study. This will involve additional modifications to the design matrices. However, we do not perceive this to be a serious complication, and the backfitting algorithms can easily be modified to account for these changes.

Our models currently assume constant means across the probe-level data for each specific gene on the gene chip. There seems to be evidence, both from our own empirical studies and in the biological literature, that this is not the case. We are exploring improved models to account for this additional structure in the data.

Finally, there are additional computational improvements currently being examined. We are exploring computing environments that do not have the memory limitation of 2 Gbytes of RAM. We are also exploring ways of imposing less severe tapering of the spatial covariances in order to approach the more optimal conditions discussed in Furrer et al. (2004). In addition, the fairly regular lattices observed in microarray data lead to a particular sparse structure in the Cholesky factor. Our preliminary experiments suggest that this structure could be exploited to dramatically improve computational performance.

Acknowledgments

The authors would like to thank Professor Isam Awad and Robert
Shenkar (Department of Neurological Surgery, Feinberg School of Medicine, Northwestern University) as well as Edith Creek (Department of Mathematics, University of Colorado at Denver) for providing the data and answering our numerous questions. The research of the first author was supported in part by a grant from the University of Colorado Genome-Biotechnology Initiative. The research of both the first and second authors was supported in part by the Geophysical Statistics Project at the National Center for Atmospheric Research under the National Science Foundation grant DMS.

References

Breiman, L. and Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlations (with discussion). Journal of the American Statistical Association, 80.

Buja, A., Hastie, T. J., and Tibshirani, R. J. (1989). Linear smoothers and additive models (with discussion). Annals of Statistics, 17.

Cressie, N. A. C. (1993). Statistics for Spatial Data. John Wiley & Sons Inc., New York, revised reprint.
Furrer, R. (2004). KriSp: An R package for Covariance Tapered Kriging of Large Datasets Using Sparse Matrix Techniques. Software/KriSp/

Furrer, R., Genton, M. G., and Nychka, D. (2004). Covariance Tapering for Interpolation of Large Spatial Datasets. Submitted to Journal of Computational and Graphical Statistics.

George, A. and Liu, J. W. H. (1981). Computer solution of large sparse positive definite systems. Prentice-Hall Inc., Englewood Cliffs, N.J.

Ihaka, R. and Gentleman, R. (1996). R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5.

Kitanidis, P. K. (1997). Introduction to Geostatistics: Applications in Hydrogeology. Cambridge University Press.

Koenker, R. and Ng, P. (2003). SparseM: Sparse Matrix Package for R.

Nychka, D. W. (2000). Spatial-process estimates as smoothers. In Schimek, M. G., editor, Smoothing and Regression: Approaches, Computation, and Application, chapter 13. John Wiley & Sons Inc., New York.

R Development Core Team (2004). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Shenkar, R., Elliott, J. P., Diener, K., Gault, J., Hu, L., Cohrs, R. J., Phang, T., Hunter, L., Breeze, R. E., and Awad, I. A. (2003). Differential gene expression in human cerebrovascular malformations (with discussion). Neurosurgery, 52.

Stein, M. L. (1999). Interpolation of Spatial Data. Springer-Verlag, New York.
Figure 6: Row (top) and column (bottom) effects for single chip example.

Figure 7: Spatial process (top) and chip specific effects (bottom) for single chip example.

Figure 8: Gene effects (top) and residuals (bottom) for single chip example.

Figure 9: Differences for the chip effects (top) and spatial processes (bottom) in the case of two chips.
Spatial Backfitting of Roller Measurement Values from a Florida Test Bed
Spatial Backfitting of Roller Measurement Values from a Florida Test Bed Daniel K. Heersink 1, Reinhard Furrer 1, and Mike A. Mooney 2 1 Institute of Mathematics, University of Zurich, CH-8057 Zurich 2
More informationThe Matrix Reloaded: Computations for large spatial data sets
The Matrix Reloaded: Computations for large spatial data sets The spatial model Solving linear systems Matrix multiplication Creating sparsity Doug Nychka National Center for Atmospheric Research Sparsity,
More informationCovariance Tapering for Interpolation of Large Spatial Datasets
Covariance Tapering for Interpolation of Large Spatial Datasets Reinhard Furrer, Marc G. Genton and Douglas Nychka Interpolation of a spatially correlated random process is used in many areas. The best
More informationThe Matrix Reloaded: Computations for large spatial data sets
The Matrix Reloaded: Computations for large spatial data sets Doug Nychka National Center for Atmospheric Research The spatial model Solving linear systems Matrix multiplication Creating sparsity Sparsity,
More informationSpatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields
Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields 1 Introduction Jo Eidsvik Department of Mathematical Sciences, NTNU, Norway. (joeid@math.ntnu.no) February
More informationCovariance Tapering for Interpolation of Large Spatial Datasets
Covariance Tapering for Interpolation of Large Spatial Datasets Reinhard FURRER, Marc G. GENTON, and Douglas NYCHKA Interpolation of a spatially correlated random process is used in many scientific areas.
More informationSPATIAL-TEMPORAL TECHNIQUES FOR PREDICTION AND COMPRESSION OF SOIL FERTILITY DATA
SPATIAL-TEMPORAL TECHNIQUES FOR PREDICTION AND COMPRESSION OF SOIL FERTILITY DATA D. Pokrajac Center for Information Science and Technology Temple University Philadelphia, Pennsylvania A. Lazarevic Computer
More informationOn Gaussian Process Models for High-Dimensional Geostatistical Datasets
On Gaussian Process Models for High-Dimensional Geostatistical Datasets Sudipto Banerjee Joint work with Abhirup Datta, Andrew O. Finley and Alan E. Gelfand University of California, Los Angeles, USA May
More informationMultivariate modelling and efficient estimation of Gaussian random fields with application to roller data
Multivariate modelling and efficient estimation of Gaussian random fields with application to roller data Reinhard Furrer, UZH PASI, Búzios, 14-06-25 NZZ.ch Motivation Microarray data: construct alternative
More informationMultivariate spatial models and the multikrig class
Multivariate spatial models and the multikrig class Stephan R Sain, IMAGe, NCAR ENAR Spring Meetings March 15, 2009 Outline Overview of multivariate spatial regression models Case study: pedotransfer functions
More informationRegression Shrinkage and Selection via the Lasso
Regression Shrinkage and Selection via the Lasso ROBERT TIBSHIRANI, 1996 Presenter: Guiyun Feng April 27 () 1 / 20 Motivation Estimation in Linear Models: y = β T x + ɛ. data (x i, y i ), i = 1, 2,...,
More informationESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS
ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, N.C.,
More informationBuilding Blocks for Direct Sequential Simulation on Unstructured Grids
Building Blocks for Direct Sequential Simulation on Unstructured Grids Abstract M. J. Pyrcz (mpyrcz@ualberta.ca) and C. V. Deutsch (cdeutsch@ualberta.ca) University of Alberta, Edmonton, Alberta, CANADA
More informationPRODUCING PROBABILITY MAPS TO ASSESS RISK OF EXCEEDING CRITICAL THRESHOLD VALUE OF SOIL EC USING GEOSTATISTICAL APPROACH
PRODUCING PROBABILITY MAPS TO ASSESS RISK OF EXCEEDING CRITICAL THRESHOLD VALUE OF SOIL EC USING GEOSTATISTICAL APPROACH SURESH TRIPATHI Geostatistical Society of India Assumptions and Geostatistical Variogram
More informationOutline. Sparse Matrices. Sparse Matrices. Sparse Matrices. Sparse Matrices. Sparse Matrices Methods and Kriging
Sparse Matrices Methods and Kriging Applications to Large Spatial Data Sets SAMSI July 28 August 1, 2009 Reinhard Furrer, University of Zurich Outline What are sparse matrices? How to work with sparse
More informationPARAMETER ESTIMATION FOR FRACTIONAL BROWNIAN SURFACES
Statistica Sinica 2(2002), 863-883 PARAMETER ESTIMATION FOR FRACTIONAL BROWNIAN SURFACES Zhengyuan Zhu and Michael L. Stein University of Chicago Abstract: We study the use of increments to estimate the
More informationA Multivariate Spatial Model for Soil Water Profiles
A Multivariate Spatial Model for Soil Water Profiles Stephan R. Sain, 1 Shrikant Jagtap, 2 Linda Mearns, 3 and Doug Nychka 4 July 20, 2004 SUMMARY: Pedotransfer functions are classes of models used to