The OK weights define the best linear unbiased predictor (BLUP). The OK prediction, z ( x ), is defined as: (2) given.

Similar documents
Simple Linear Regression

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions.

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Functions of Random Variables

Sequential Approach to Covariance Correction for P-Field Simulation

Chapter 14 Logistic Regression Models

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ " 1

ENGI 3423 Simple Linear Regression Page 12-01

Linear Regression with One Regressor

Econometric Methods. Review of Estimation

Lecture 8: Linear Regression

ECONOMETRIC THEORY. MODULE VIII Lecture - 26 Heteroskedasticity

Multivariate Transformation of Variables and Maximum Likelihood Estimation

Multiple Linear Regression Analysis

4. Standard Regression Model and Spatial Dependence Tests

STK4011 and STK9011 Autumn 2016

Estimation of Stress- Strength Reliability model using finite mixture of exponential distributions

Analysis of Variance with Weibull Data

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

Special Instructions / Useful Data

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades

COV. Violation of constant variance of ε i s but they are still independent. The error term (ε) is said to be heteroscedastic.

Statistics MINITAB - Lab 5

Application of Calibration Approach for Regression Coefficient Estimation under Two-stage Sampling Design

Probability and. Lecture 13: and Correlation

TESTS BASED ON MAXIMUM LIKELIHOOD

Chapter 5 Properties of a Random Sample

Lecture Notes Types of economic variables

Objectives of Multiple Regression

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

ESS Line Fitting

Chapter 8: Statistical Analysis of Simulated Data

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best

Point Estimation: definition of estimators


Comparison of Dual to Ratio-Cum-Product Estimators of Population Mean

A New Family of Transformations for Lifetime Data

Bayes Estimator for Exponential Distribution with Extension of Jeffery Prior Information

ρ < 1 be five real numbers. The

Simple Linear Regression

ECON 482 / WH Hong The Simple Regression Model 1. Definition of the Simple Regression Model

12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model

ENGI 4421 Propagation of Error Page 8-01

LINEAR REGRESSION ANALYSIS

The number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter

6.867 Machine Learning

Chapter 13 Student Lecture Notes 13-1

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

Lecture 3 Probability review (cont d)

Simulation Output Analysis

Maximum Likelihood Estimation

Example: Multiple linear regression. Least squares regression. Repetition: Simple linear regression. Tron Anders Moger

Midterm Exam 1, section 1 (Solution) Thursday, February hour, 15 minutes

Multivariate Probability Estimation for Categorical Variables from Marginal Distribution Constraints

L5 Polynomial / Spline Curves

CLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs

residual. (Note that usually in descriptions of regression analysis, upper-case

Lecture 3. Sampling, sampling distributions, and parameter estimation

The equation is sometimes presented in form Y = a + b x. This is reasonable, but it s not the notation we use.

3. Basic Concepts: Consequences and Properties

Chapter 4 Multiple Random Variables

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

Summary of the lecture in Biostatistics

Handout #8. X\Y f(x) 0 1/16 1/ / /16 3/ / /16 3/16 0 3/ /16 1/16 1/8 g(y) 1/16 1/4 3/8 1/4 1/16 1

13. Parametric and Non-Parametric Uncertainties, Radial Basis Functions and Neural Network Approximations

STA302/1001-Fall 2008 Midterm Test October 21, 2008

( ) = ( ) ( ) Chapter 13 Asymptotic Theory and Stochastic Regressors. Stochastic regressors model

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek

9.1 Introduction to the probit and logit models

Outline. Point Pattern Analysis Part I. Revisit IRP/CSR

Chapter Two. An Introduction to Regression ( )

2.28 The Wall Street Journal is probably referring to the average number of cubes used per glass measured for some population that they have chosen.

CHAPTER VI Statistical Analysis of Experimental Data

Chapter 8. Inferences about More Than Two Population Central Values

Lecture 1 Review of Fundamental Statistical Concepts

Chapter 11 The Analysis of Variance

Wu-Hausman Test: But if X and ε are independent, βˆ. ECON 324 Page 1

Statistics: Unlocking the Power of Data Lock 5

Bayesian Classification. CS690L Data Mining: Classification(2) Bayesian Theorem: Basics. Bayesian Theorem. Training dataset. Naïve Bayes Classifier

STRONG CONSISTENCY FOR SIMPLE LINEAR EV MODEL WITH v/ -MIXING

Mean is only appropriate for interval or ratio scales, not ordinal or nominal.

Comparing Different Estimators of three Parameters for Transmuted Weibull Distribution

Testing for the Multivariate Gaussian Distribution of Spatially Correlated Data

Continuous Distributions

X ε ) = 0, or equivalently, lim

Taylor s Series and Interpolation. Interpolation & Curve-fitting. CIS Interpolation. Basic Scenario. Taylor Series interpolates at a specific

A Combination of Adaptive and Line Intercept Sampling Applicable in Agricultural and Environmental Studies

Lecture 2: Linear Least Squares Regression

Confidence Intervals for Double Exponential Distribution: A Simulation Approach

Block-Based Compact Thermal Modeling of Semiconductor Integrated Circuits

PGE 310: Formulation and Solution in Geosystems Engineering. Dr. Balhoff. Interpolation

CHAPTER 3 POSTERIOR DISTRIBUTIONS

To use adaptive cluster sampling we must first make some definitions of the sampling universe:

Estimation of Population Total using Local Polynomial Regression with Two Auxiliary Variables

MEASURES OF DISPERSION

Analysis of Lagrange Interpolation Formula

Linear Regression Linear Regression with Shrinkage. Some slides are due to Tommi Jaakkola, MIT AI Lab

STK3100 and STK4100 Autumn 2018

Module 7. Lecture 7: Statistical parameter estimation

Transcription:

JSR data archve materal for HOW POROSITY AND PERMEABILITY VARY SPATIALLY WITH GRAIN SIZE, SORTING, CEMENT VOLUME AND MINERAL DISSOLUTION IN FLUVIAL TRIASSIC SANDSTONES: THE VALUE OF GEOSTATISTICS AND LOCAL REGRESSION. J.M. MCKINLEY, P.M. ATKINSON, C. D. LLOYD, A.H. RUFFELL ad R.H. WORDEN Geostatstcal Techques I geostatstcs, spatal varato s modelled as comprsg two dstct parts, a determstc compoet μ(x) ad a stochastc (or radom ) compoet R(x) where x represets locato. Ths s termed a radom fucto (RF) model. The upper case Z refers to the RF whereas lower case z refers to a data realsato of the RF model, called a regoalsed varable (ReV). I geostatstcs, a spatally-refereced varable z(x) s treated as a outcome of a RF Z(x). The Theory of Regoalsed Varables (Mathero, 1971) s the fudametal framework o whch geostatstcs s based. I a classcal framework, geostatstcal aalyss usually volves two stages: () estmato of the varogram ad fttg a model to t ad () use of the varogram model coeffcets for spatal predcto (krgg) or smulato. Varography The varogram s a core tool geostatstcal aalyss ad s requred for geostatstcal spatal predcto. The varogram characterses spatal depedece the property of terest such as permeablty or porosty. I smple terms, the varogram s estmated by calculatg half the average squared dfferece betwee all the avalable pared measuremets separated by a gve lag tolerace, where lag h s a separato vector (dstace ad drecto). The expermetal varogram, ˆ γ ( h), ca be estmated from p(h) pared observatos, z(x ), z(x + h), = 1, 2, p(h) usg: p( h) 1 2 ˆ( γ h) = { z( x ) z( x + h) } (1) 2 p( h) = 1 A mathematcal model may be ftted to the expermetal varogram ad the coeffcets of ths model ca be used for a rage of geostatstcal operatos such as spatal predcto (krgg) ad

codtoal smulato. A model s usually selected from oe of a set of authorsed models. McBratey ad Webster (1986) provde a revew of some of the most wdely used authorsed models. The varous parameters of the models ftted to expermetal varograms ca be related to structural features a rock face or successo (McKley et al. 24). For example, bouded or trastve types of model, the rage parameter provdes formato about the maxmum scale of spatal varato. Krgg Krgg s a smoothg terpolator sce krgg predctos are weghted movg averages of the avalable data. The most wdely used varat of krgg, termed ordary krgg () s used ths study. allows the mea to vary spatally: the mea s estmated for each predcto eghbourhood from the avalable data. ˆ The weghts defe the best lear ubased predctor (BLUP). The predcto, z ( x ), s defed as: zˆ ( x ) = λ z( x ) (2) = 1 gve = 1,... avalable data z ( x ) esure ubased predcto: ˆ, wth the costrat that the weghts, λ, sum to 1 to = 1 λ = 1 (3) The krgg predcto error must have a expected value of : E Zˆ ( x ) Z( x )} (4) { = 2 The krgg (or predcto) varace, σ, s expressed as:

ˆ σ 2 ( x ) = [{ ˆ ( ) ( )} 2 E Z x Z x ] (5) = 2 λ γ ( x x ) λ = 1 = 1 = 1 λ γ ( x x ) That s, we seek the values of λ 1,, λ (the weghts) that mmse ths expresso wth the costrat that the weghts sum to oe (equato 3). Ths mmsato s acheved through Lagrage Multplers. The codtos for the mmsato are gve by the system comprsg + 1 equatos ad + 1 ukows: = 1 = 1 λ λ γ ( x = 1 x ) + ψ = γ ( x x ) = 1,..., (6) where γ x x ) represets the semvarace betwee the avalable data ad themselves, ( γ ( x x ) represets the semvarace betwee the avalable data ad the predcto locato x, ad ψ s a Lagrage multpler. Kowg ψ, the predcto varace of ca be gve as: 2 ˆ σ = ψ + λ γ ( x x ) = 1 (7) The krgg varace s a measure of cofdece predctos ad s a fucto of the form of the varogram, the sample cofgurato ad the sample support (the area over whch a observato s made, whch may be approxmated as a pot or may be a area; Jourel ad Hubregts, 1978). The krgg varace s ot codtoal o the data values locally ad ths has led some researchers to use alteratve approaches such as codtoal smulato (dscussed the ext secto) to buld models of spatal ucertaty (Goovaerts, 1997).

Codtoal smulato As stated prevously, krgg predctos are weghted movg averages of the avalable sample data. Krgg s, therefore, a smoothg terpolator. Codtoal smulato (also called stochastc magg) s ot subject to the smoothg assocated wth krgg (coceptually, the varato lost by krgg due to smoothg s added back) as predctos are draw from equally probable jot realsatos of the Radom Varables (RVs) whch make up a RF model (Deutsch ad Jourel, 1998). That s, smulated values are ot the expected values (.e., the mea), but are values draw radomly from the codtoal cumulatve dstrbuto fucto (ccdf): a fucto of the avalable observatos ad the modelled spatal varato (Duga, 1999). Smulated realsatos represet a possble realty whereas krgg predctos do ot. Therefore SGS realsatos ca be used to represet possble varato for propertes of a rock face or successo whereas krgg produces a smoothed terpolato of the data. The smulato s cosdered codtoal f the smulated values hoour the observatos at ther locatos (Deutsch ad Jourel, 1998). Smulato allows the geerato of may dfferet possble realsatos whch ecapsulate the ucertaty spatal predcto (Jourel, 1996) ad thus may be used as a gude to potetal errors the charactersato of varato rock propertes (Jourel, 1996). Probably the most wdely used form of codtoal smulato s sequetal Gaussa smulato (SGS) as used ths study. Wth sequetal smulato, smulated values are codtoal o the orgal data ad prevously smulated values (Deutsch ad Jourel, 1998). I SGS the ccdfs are all assumed to be Gaussa. The SGS algorthm follows several steps (Goovaerts, 1997; Deutsch, 22) as detaled below: 1. Apply a stadard ormal trasform to the data. 2. Go to the locato x 1. 3. Use SK (ote s ofte used stead; see Deutsch ad Jourel 1998 about ths ssue), codtoal o the orgal data, z x ), to make a predcto. The SK predcto ad the krgg varace are parameters (the mea ad varace) of a Gaussa ccdf: ( F x ; z ( ) = Prob{ Z( x ) z ( )} (8) ( 1 1

l 4. Usg Mote Carlo smulato, draw a radom resdual, z x ), from the ccdf. 5. Add the SK predcto ad the resdual to gve the smulated value; the smulated value s added to the data set. 6. Vst all locatos radom order ad predct usg SK codtoal o the orgal data ad l the -1 values, z ( x ), smulated at the prevously vsted locatos x j, j=1,, to model the ccdf: ( 1 F x ; z ( + 1) = Prob{ Z( x ) z ( + 1)} (9) ( 1 1 7. Follow the procedure steps 4 ad 5 utl all locatos have bee vsted (that s, utl realsatos l = 1,, L have bee obtaed). 8. Back trasform the data values ad smulated values. By usg dfferet radom umber seeds, the order of vstg locatos s vared ad multple realsatos ca be obtaed. I other words, sce the smulated values are added to the data set, the values avalable for use smulato are partly depedet o the locatos at whch smulatos have already bee made ad, because of ths, the values smulated at ay oe locato vary as the avalable data vary. SGS s dscussed detal several texts (e.g., Goovaerts, 1997; Deutsch ad Jourel, 1998; Chlès ad Delfer, 1999; Deutsch, 22). GEOGRAPHICALLY WEIGHTED REGRESSION (GWR) Ths secto descrbes the basc global lear regresso model ad ts exteso to geographcally weghted regresso (GWR) (local regresso wth spatally varyg parameters). The basc lear regresso model s: Lear Regresso y m 1 = + k xk + ε (1) = k 1

predctg the target varable y at locato as a fucto of m parameters ad k (k=1,, m- 1) ad m-1 explaatory varables x k. The model cludes a depedet ormally dstrbuted error term wth zero mea ε. Usually, the least squares method, based o matrx algebra, s used to estmate the k s. Ths s readly expressed matrx otato as; t 1 t ( x x) x y ˆ = (11) whereˆ s a sgle colum vector of estmated parameters (.e., regresso coeffcets), a frst colum of 1s ad the m-1 explaatory varables fll the m colums of x, the target varable flls the sgle colum of y ad t dcates the traspose of a matrx. The sgle colum of 1s the matrx x allows estmato of. Equato 1 s a statoary model of the relatos betwee y ad x, that s, a model whch the parameters of the model are costat rrespectve of geographcal locato. Thus, the model s approprate oly where the relatos may reasoably be modelled as varat over space. Where the relatos are expected or kow to chage wth geographcal locato t s ecessary to apply a spatally o-statoary model: oe that allows varato the parameters of the model as oe moves from place to place. No-statoary regresso modellg s ot ew, but Brusdo et al. (1999) developed the approach ad termed t geographcally weghted regresso (GWR). GWR offers a bass upo whch to address some of the lmtatos of covetoal global regresso models descrbed the troducto. Geographcally Weghted Regresso GWR exteds the tradtoal regresso model of Equato 1 to the o-statoary case by allowg the coeffcets to vary wth geographcal locato : y m = + 1 k = 1 x + ε (12) k k

where k s the value of the k th parameter at locato. Equato 1 s a specal case of the more geeral Equato 12, where the parameters do ot vary across space. A potetal problem wth Equato 5 s that there s oly oe observato at each for the target varable y ad the m-1 explaatory varables x k. To provde suffcet observatos wth whch to ft a local regresso model for each, Brusdo et al. (1999) cosdered the relato betwee y ad x wth a movg wdow aroud, whle ackowledgg that such a approach volves approxmato. The weghted least squares (WLS) approach to fttg regresso models provdes a meas by whch to vary the fluece of dvdual observatos o the ftted model. Specfcally, WLS a weght w s appled to each squared dfferece before mmsato. The weghts are usually chose to versely reflect some measure of ucertaty so that more ucerta observatos are assged less weght. If w s the dagoal matrx of the w s the Equato 11 ca be exteded readly as: t 1 t ( x wx) x wy ~ = (13) Sce the dagoal weghts matrx w vares wth locato j, such that a dfferet calbrato exsts for each j, t s helpful to wrte Equato 6 more strctly as: t 1 t ( x w() j x) x w()y j ~ = (14) Locato has ow bee refereced by j rather tha because the locato at whch a model s ftted eed ot equal the locato of the avalable data. Brusdo et al. (1999) cosder a rage of possble weghtg fuctos for use Equato 7. These clude a smple step fucto:

w w = 1 = f d otherwse < d j (15) where d s the dstace betwee the locato of a observato used model fttg ad the locato j at whch the model s ftted. Ths weghtg fucto s fast because the umber of data used the regresso model s lmted. However, ts dscotuous ature ca result artefacts (spatal dscotutes) the regresso coeffcets. A alteratve s to specfy w as a cotuous fucto of d. The expoetal fucto s a commo choce: d w = exp (16) a where, a s a o-lear parameter. As a compromse betwee the computatoal savg of Equato 15 ad the cotuty of Equato 16, Brusdo et al. (1999) suggest usg the bsquare fucto: w w = 2 2 [ 1 d d ] = 2 f d otherwse < d j (17) The essetal dea of GWR, whatever weghtg fucto s chose, s to gve more weght to observatos close to the locato j at whch the regresso model s desred tha to those observatos that are further away. I ths sese, the method has some parallels wth geostatstcal techques for local predcto, as descrbed above. Refereces Brusdo, C., Atk, M., Fothergham, S., ad Charlto, M.E., 1999, A comparso of radom coeffcet modellg ad geographcally weghted regresso for spatally o-statoary regresso problems: Geographcal ad Evrometal Modellg, v. 3, p. 47-62. Chlès, J.P., ad Delfer, P., 1999, Geostatstcs: Modelg Spatal Ucertaty: New York, Joh Wley ad Sos, 695 p.

Deutsch, C.V., 22, Geostatstcal Reservor Modellg: New York, Oxford Uversty Press, 376 p Deutsch, C.V., ad Jourel, A.G., 1998, GSLIB: Geostatstcal Software Lbrary ad User s Gude, Secod Edto: New York, Oxford Uversty Press, 369 p. Duga, J.L., 1999, Codtoal smulato, Ste A., va der Meer, F., ad Gorte B., eds., Spatal Statstcs for Remote Sesg, Ste: Dordrecht, Kluwer Academc Publshers, p. 135 152. Goovaerts, P., 1997, Geostatstcs for Natural Resources Evaluato: New York, Oxford Uversty Press, 483 p. Jourel, A.G., 1996, Modellg ucertaty ad spatal depedece: Stochastc magg. Iteratoal Joural of Geographcal Iformato Systems, v. 1, p. 517-522. Jourel, A.G., ad Hubregts, C.J., 1978, Mg Geostatstcs, Orlado: Academc Press,6p. Mathero, G., 1971, The Theory of Regoalsed Varables ad ts Applcatos:Fotaebleau, Cetre de Morphologe Mathématque de Fotaebleau, 211 p. McBratey A.B., Webster R., 1986, Choosg fuctos for sem-varograms of sol propertes ad fttg them to samplg estmates: Joural of Sol Scece, v. 37, p. 617 639. McKley, J.M., Lloyd, C.D., ad Ruffell, A.H., 24, Use of varography permeablty charactersato of vsually homogeeous sadstoe reservors wth examples from outcrop studes: Mathematcal Geology, v. 36, p. 761-779.