4 Multidimensional quantitative data


4.0 Multidimensional statistics

Basic statistics are now part of the curriculum of most ecologists. However, statistical techniques based on such simple distributions as the unidimensional normal distribution are not really appropriate for analysing complex ecological data sets. Nevertheless, researchers sometimes perform series of simple analyses on the various descriptors in the data set, expecting to obtain results that are pertinent to the problem under study. This type of approach is incorrect because it does not take into account the covariance among descriptors; see also Box 1.3, where the statistical problem created by multiple testing is explained. In addition, such an approach only extracts minimum information from data which have often been collected at great cost, and it usually generates a mass of results from which it is difficult to draw much sense. Finally, in studies involving species assemblages, it is usually more interesting to describe the variability of the structure of the assemblage as a whole (i.e. mensurative variation observed through space or time, or manipulative variation resulting from experimental manipulation; Hurlbert, 1984) than to look at each species independently.

Fortunately, methods derived from multidimensional statistics, which are used throughout this book, are designed for analysing complex data sets. These methods take into account the co-varying nature of ecological data and can reveal the structures that underlie the data. The present chapter discusses the basic theory and characteristics of multidimensional data analysis. Mathematics are kept to a minimum, so that readers can easily reach a high level of understanding. Many approaches of practical interest are discussed, including several types of linear correlation, with their statistical tests. It must be noted that this chapter is limited to linear statistics.

A number of excellent textbooks deal with detailed aspects of multidimensional statistics. For example, formal presentations of the subject are found in Muirhead (1982) and Anderson (1984). Researchers less interested in mathematical theory may refer to Cooley & Lohnes (1971), Tatsuoka (1971), Press (1972), Graybill (1983), or Morrison (1990). These books describe a number of useful methods, among which the multidimensional analysis of variance. However, none of these books specifically deals with ecological data.

Several authors use the term multivariate as an abbreviation for multidimensional variate (the latter term meaning random variable; Section 1.0). As an adjective, multivariate is interchangeable with multidimensional.

4.1 Multidimensional variables and dispersion matrix

As stated in Section 1.0, the present textbook deals with the analysis of random variables. Ecological data matrices have n rows and p columns (Section 2.1). Each row is a vector (Section 2.4) which is, statistically speaking, one realization of a p-dimensional random variable. In other words, for example, when p species are observed at n sampling sites, the species are the p dimensions of a random variable "species" and each site is one realization of this p-dimensional random variable.

To illustrate this concept, four sampling units with two species (Table 4.1) are plotted in a two-dimensional Euclidean space (Fig. 4.1). Vector "site 1" is the doublet (5,1). It is plotted in the same two-dimensional space as the three other vectors "site i". Each row of the data matrix is a two-dimensional vector, which is one realization of the (bivariate) random variable "species". The random variable "species" is said to be two-dimensional because the sampling units (objects) contain two species (descriptors), the two dimensions being species 1 and 2, respectively.

Table 4.1 Numerical example of two species observed at four sampling sites. Figure 4.1 shows that each row of the data matrix may be construed as a vector, as defined in Section 2.4.

Sampling sites    Species (descriptors)
(objects)         1        2        (p = 2)
1                 5        1
2                 3        2
3                 8        3
4                 6        4
(n = 4)

Figure 4.1 Four realizations (sampling sites from Table 4.1) of the two-dimensional random variable "species" are plotted in a two-dimensional Euclidean space. [Plot with axes Species 1 and Species 2, showing Site 1 (5,1), Site 2 (3,2), Site 3 (8,3) and Site 4 (6,4).]

As the number of descriptors (e.g. species) increases, the number of dimensions of the random variable "species" similarly increases, so that more axes are necessary to construct the space in which the objects are plotted. Thus, the p descriptors make up a p-dimensional random variable and the n vectors of observations are as many realizations of the p-dimensional vector "descriptors". The present chapter does not deal with samples of observations, which result from field or laboratory work (for a brief discussion on sampling, see Section 1.1), but it focuses instead on populations, which are investigated by means of the samples.

Before approaching the multidimensional normal distribution, it is necessary to define a p-dimensional random variable "descriptors":

$$\mathbf{Y} = [y_1, y_2, \ldots, y_j, \ldots, y_p] \qquad (4.1)$$

Each element $y_j$ of the multidimensional table Y is a unidimensional random variable. Every descriptor $y_j$ is observed in each of the n vectors "object", each sampling unit i providing one realization of the p-dimensional random variable (Fig. 4.2).

In ecology, the structure of dependence among descriptors is, in many instances, the matter being investigated. Researchers who study multidimensional data sets using univariate statistics assume that the p unidimensional $y_j$ variables in Y are independent of one another (this refers to the third meaning of independence in Box 1.1). This is the reason why univariate statistical methods are inappropriate with most ecological data and why methods that take into account the dependence among descriptors must be used when analysing sets of multidimensional data. Only these methods will generate proper results when there is dependence among descriptors; it is never acceptable to replace a multidimensional analysis by a series of unidimensional treatments.

Figure 4.2 Structure of ecological data. Given their nature, ecological descriptors are dependent on one another. In statistics, the objects are often assumed to be independent observations, but this is generally not the case in ecology (Section 1.1). [Schematic: the data matrix Y (n objects × p descriptors, elements $y_{ij}$) and the dispersion (covariance) matrix S (descriptors × descriptors, elements $s_{jk}$), with $s_{jk} = \frac{1}{n-1}\sum_{i=1}^{n}(y_{ij}-\bar{y}_j)(y_{ik}-\bar{y}_k)$.]

The usual tests of significance require, however, that "successive sample observation vectors from the multidimensional population have been drawn in such a way that they can be construed as realizations of independent random vectors" (Morrison, 1990, p. 80). Section 1.1 has shown that this assumption of independence among observations is most often not realistic in ecology. Lack of independence among the observations (data rows) does not really matter when statistical models are used for descriptive purposes only, as is often the case in the present book. For statistical testing, however, corrected tests of significance have to be used when the observations are autocorrelated (Section 1.1).

To sum up: (1) the p descriptors in ecological data matrices are the p dimensions of a random variable "descriptors"; (2) in general, the p descriptors are not linearly independent of one another; methods of multidimensional analysis are designed to bring out the structure of linear dependence among descriptors; (3) each of the n sampling units is a realization of the p-dimensional vector "descriptors"; (4) the usual tests of significance assume that the n sampling units are realizations of independent random vectors. The latter condition is generally not met in ecology, with consequences that were mentioned in the previous paragraph and discussed in Section 1.1. For the various meanings of the term independence in statistics, see Box 1.1.

The dependence among quantitative variables $y_j$ brings up the concept of covariance. Covariance is the extension, to two descriptors, of the concept of variance. Variance is a measure of the dispersion of a random variable $y_j$ around its mean; it is denoted $\sigma_j^2$. Covariance measures the joint dispersion of two random variables $y_j$ and $y_k$ around their means; it is denoted $\sigma_{jk}$. The dispersion matrix of Y, called matrix $\boldsymbol{\Sigma}$ (sigma), contains the variances and covariances of the p descriptors (Fig. 4.2):

$$\boldsymbol{\Sigma} = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix} \qquad (4.2)$$

Matrix $\boldsymbol{\Sigma}$ is an association matrix [descriptors × descriptors] (Section 2.2). The elements $\sigma_{jk}$ of matrix $\boldsymbol{\Sigma}$ are the covariances between all pairs of the p random variables. The matrix is symmetric because the covariance of $y_j$ and $y_k$ is identical to that of $y_k$ and $y_j$. A diagonal element of $\boldsymbol{\Sigma}$ is the covariance of a descriptor $y_j$ with itself, which is the variance of $y_j$, so that $\sigma_{jj} = \sigma_j^2$.

The estimate of the variance of $y_j$, denoted $s_j^2$, is computed on the centred variable $(y_{ij} - \bar{y}_j)$. Variable $y_j$ is centred by subtracting the mean $\bar{y}_j$ from each of the n observations $y_{ij}$. As a result, the mean of the centred variable is zero. The unbiased estimator $s_j^2$ of the population variance is computed using the well-known formula:

$$s_j^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_{ij}-\bar{y}_j)^2 \qquad (4.3)$$

where the sum of squares of the centred data, for descriptor j, is divided by the number of objects minus one (n − 1). The summation is over the n observations of descriptor j. In the same way, the estimate ($s_{jk}$) of the covariance ($\sigma_{jk}$) of $y_j$ and $y_k$ is computed on the centred variables $(y_{ij} - \bar{y}_j)$ and $(y_{ik} - \bar{y}_k)$, using the formula of a "bivariate variance". The covariance $s_{jk}$ is calculated as:

$$s_{jk} = \frac{1}{n-1}\sum_{i=1}^{n}(y_{ij}-\bar{y}_j)(y_{ik}-\bar{y}_k) \qquad (4.4)$$

When k = j, eq. 4.4 is identical to eq. 4.3. The positive square root of the variance is called the standard deviation ($\sigma_j$). Its estimate $s_j$ is thus:

$$s_j = \sqrt{s_j^2} \qquad (4.5)$$
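As a small worked illustration of eqs. 4.3 to 4.5, the sketch below computes the variances, the covariance and one standard deviation for the two-species, four-site example, whose sites lie at (5,1), (3,2), (8,3) and (6,4). The function names `mean` and `covariance` are chosen here for illustration only; they are not taken from the text.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """Unbiased estimate s_jk (eq. 4.4); covariance(a, a) is the variance s_j^2 (eq. 4.3)."""
    mean_a, mean_b = mean(a), mean(b)
    cross_products = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return cross_products / (len(a) - 1)

# Abundances of the two species at the four sites: (5,1), (3,2), (8,3), (6,4).
species_1 = [5, 3, 8, 6]
species_2 = [1, 2, 3, 4]

var_1 = covariance(species_1, species_1)   # s_1^2
var_2 = covariance(species_2, species_2)   # s_2^2
cov_12 = covariance(species_1, species_2)  # s_12
sd_1 = math.sqrt(var_1)                    # s_1 (eq. 4.5)

print(var_1, var_2, cov_12, sd_1)
```

Calling `covariance` with the same list twice reproduces the remark that eq. 4.4 reduces to eq. 4.3 when k = j.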

Table 4.2 Symbols used to identify (population) parameters and (sample) statistics.

               Parameter                             Statistic
               Matrix or vector    Elements          Matrix or vector    Elements
Covariance     Σ (sigma)           σ_jk (sigma)      S                   s_jk
Correlation    Ρ (rho)             ρ_jk (rho)        R                   r_jk
Mean           μ (mu)              μ_j (mu)          ȳ                   ȳ_j

The symbols for matrix Σ and summation ∑ should not be confused.

The coefficient of variation is a dimensionless measure of variation. CV is used to compare the variation of variables expressed in different physical units. It is obtained by dividing the standard deviation $s_j$ by the mean of variable j:

$$CV_j = s_j / \bar{y}_j$$

Since the standard deviation and the mean of a variable have the same physical units, $CV_j$ is dimensionless. $CV_j$ is only defined for quantitative variables that have non-zero means and it does not make sense for interval-scale variables (Subsection 1.4.1), for which the value of the mean is arbitrary. The coefficient of variation may be rescaled to percentages by multiplying its value by 100. For small n, an estimate with reduced bias is obtained by multiplying CV by (1 + 1/(4n)).

Contrary to the variance, which is always positive or null, the covariance may take positive or negative values. To understand the meaning of the covariance, imagine that the object points are plotted in a scatter diagram where the axes are descriptors $y_j$ and $y_k$. The data are centred by drawing new axes, whose origin is at the centroid $(\bar{y}_j, \bar{y}_k)$ of the cloud of points (centred plots of that kind with positive and negative correlations are shown in Fig. 4.7). A positive covariance means that most of the points are in quadrants I and III of the centred plot, where the centred values $(y_{ij} - \bar{y}_j)$ and $(y_{ik} - \bar{y}_k)$ have the same signs. This corresponds to a positive relationship between the two descriptors. The converse is true for a negative covariance, for which most of the points are in quadrants II and IV of the centred plot. When the covariance is null or small, the points are equally distributed among the four quadrants of the centred plot.

Greek and roman letters are used here (Table 4.2). The properties of a population (called parameters) are denoted by greek letters. Their estimates (called statistics), computed from samples, are symbolized by the corresponding roman letters. These conventions are complemented by those pertaining to matrix notation (Section 2.1).
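The coefficient of variation described earlier in this section, its rescaling to percentages and the small-sample factor (1 + 1/(4n)) can be sketched as follows. The function name, its `small_sample_correction` flag and the abundance values are hypothetical choices made for this illustration.

```python
import math

def coefficient_of_variation(values, small_sample_correction=False):
    """CV = s / mean, optionally multiplied by (1 + 1/(4n)) for small n."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    cv = math.sqrt(variance) / mean
    if small_sample_correction:
        cv *= 1 + 1 / (4 * n)
    return cv

abundances = [5, 3, 8, 6]  # hypothetical counts on a ratio scale
cv = coefficient_of_variation(abundances)
cv_percent = 100 * cv
cv_corrected = coefficient_of_variation(abundances, small_sample_correction=True)

print(cv, cv_percent, cv_corrected)
```

The explicit check on the mean reflects the restriction that CV is only defined for variables with non-zero means; it does not detect interval-scale variables, which the user must exclude.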

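The dispersion matrix* S can be computed directly, by multiplying the matrix of centred data $[\mathbf{y}-\bar{\mathbf{y}}]$ with its transpose $[\mathbf{y}-\bar{\mathbf{y}}]'$:

$$\mathbf{S} = \frac{1}{n-1}[\mathbf{y}-\bar{\mathbf{y}}]'[\mathbf{y}-\bar{\mathbf{y}}] \qquad (4.6)$$

$$\mathbf{S} = \frac{1}{n-1}\begin{bmatrix} (y_{11}-\bar{y}_1) & (y_{21}-\bar{y}_1) & \cdots & (y_{n1}-\bar{y}_1) \\ (y_{12}-\bar{y}_2) & (y_{22}-\bar{y}_2) & \cdots & (y_{n2}-\bar{y}_2) \\ \vdots & \vdots & & \vdots \\ (y_{1p}-\bar{y}_p) & (y_{2p}-\bar{y}_p) & \cdots & (y_{np}-\bar{y}_p) \end{bmatrix}\begin{bmatrix} (y_{11}-\bar{y}_1) & (y_{12}-\bar{y}_2) & \cdots & (y_{1p}-\bar{y}_p) \\ (y_{21}-\bar{y}_1) & (y_{22}-\bar{y}_2) & \cdots & (y_{2p}-\bar{y}_p) \\ \vdots & \vdots & & \vdots \\ (y_{n1}-\bar{y}_1) & (y_{n2}-\bar{y}_2) & \cdots & (y_{np}-\bar{y}_p) \end{bmatrix}$$

$$\mathbf{S} = \frac{1}{n-1}\begin{bmatrix} \sum_{i}(y_{i1}-\bar{y}_1)^2 & \sum_{i}(y_{i1}-\bar{y}_1)(y_{i2}-\bar{y}_2) & \cdots & \sum_{i}(y_{i1}-\bar{y}_1)(y_{ip}-\bar{y}_p) \\ \sum_{i}(y_{i2}-\bar{y}_2)(y_{i1}-\bar{y}_1) & \sum_{i}(y_{i2}-\bar{y}_2)^2 & \cdots & \sum_{i}(y_{i2}-\bar{y}_2)(y_{ip}-\bar{y}_p) \\ \vdots & \vdots & & \vdots \\ \sum_{i}(y_{ip}-\bar{y}_p)(y_{i1}-\bar{y}_1) & \sum_{i}(y_{ip}-\bar{y}_p)(y_{i2}-\bar{y}_2) & \cdots & \sum_{i}(y_{ip}-\bar{y}_p)^2 \end{bmatrix}$$

where each sum is over i = 1 to n. This elegant and rapid procedure shows once again the advantage of matrix algebra in numerical ecology, where the data sets are generally large.

Numerical example. Four species (p = 4) were observed at five stations (n = 5). The estimated population parameters, for the species, are the means ($\bar{y}_j$), the variances ($s_j^2$), and the covariances ($s_{jk}$). The original and centred data are shown in Table 4.3. Because $s_{jk} = s_{kj}$, the dispersion matrix is symmetric. The mean of each centred variable is zero. In this numerical example, the covariance between species 2 and the other three species is zero. This does not necessarily mean that species 2 is independent of the other three, but simply that the joint linear dispersion of species 2 with any one of the other three is zero. This example will be revisited in Section 4.2.

* Some authors call $[\mathbf{y}-\bar{\mathbf{y}}]'[\mathbf{y}-\bar{\mathbf{y}}]$ a dispersion matrix and S a covariance matrix. For these authors, a covariance matrix is then a dispersion matrix divided by (n − 1).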
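The matrix computation of eq. 4.6 can be sketched in plain Python as shown below. The 5 × 4 data matrix `Y` is an illustrative assumption: its values were chosen here so that it shows the properties described in the numerical example (zero covariance between species 2 and the other three species), and they should not be read as the contents of Table 4.3. The helper names are likewise hypothetical.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def matmul(a, b):
    b_columns = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns] for row in a]

def centre(data):
    """Subtract each column mean from the values of that column."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[value - m for value, m in zip(row, means)] for row in data]

def dispersion_matrix(data):
    """S = [y - ybar]' [y - ybar] / (n - 1), as in eq. 4.6."""
    centred = centre(data)
    cross_products = matmul(transpose(centred), centred)
    n = len(data)
    return [[value / (n - 1) for value in row] for row in cross_products]

# Illustrative data (assumed values): 5 sites (rows) by 4 species (columns).
Y = [
    [1, 5, 2, 6],
    [2, 2, 1, 8],
    [3, 1, 3, 4],
    [4, 2, 5, 0],
    [5, 5, 4, 2],
]

S = dispersion_matrix(Y)
for row in S:
    print(row)
```

With these assumed values the second row and column of `S` are zero apart from the variance on the diagonal, which is the situation the numerical example discusses.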

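Table 4.3 Numerical example. Calculation of centred data and covariances. [Table layout: sites 1 to 5 in rows; original data Y and centred data $[\mathbf{y}-\bar{\mathbf{y}}]$ in columns; a row of means; and the resulting matrix $\mathbf{S} = \frac{1}{n-1}[\mathbf{y}-\bar{\mathbf{y}}]'[\mathbf{y}-\bar{\mathbf{y}}]$. The numerical values did not survive extraction.]

The square root of the determinant of the dispersion matrix, $|\mathbf{S}|^{1/2}$, is known as the generalized variance. It is also equal to the square root of the product of the eigenvalues of S.

Any dispersion matrix S is positive semidefinite (Table 2.2). Indeed, the quadratic form of S (p × p) with any real and non-null vector t (of size p) is:

$$\mathbf{t}'\mathbf{S}\mathbf{t}$$

This expression may be expanded using eq. 4.6:

$$\mathbf{t}'\mathbf{S}\mathbf{t} = \mathbf{t}'\,\frac{1}{n-1}[\mathbf{y}-\bar{\mathbf{y}}]'[\mathbf{y}-\bar{\mathbf{y}}]\,\mathbf{t}$$

$$\mathbf{t}'\mathbf{S}\mathbf{t} = \frac{1}{n-1}\big([\mathbf{y}-\bar{\mathbf{y}}]\mathbf{t}\big)'\big([\mathbf{y}-\bar{\mathbf{y}}]\mathbf{t}\big) = \text{a scalar}$$

This scalar is the variance of the variable resulting from the product Yt. Since a variance, which is a sum of squared values, can only be positive or null, it follows that:

$$\mathbf{t}'\mathbf{S}\mathbf{t} \geq 0$$

so that S is positive semidefinite.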
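The argument above can be checked numerically: for any vector t, the quadratic form t'St should equal the sample variance of the projected variable Yt, and therefore cannot be negative. The sketch below does this for a few random vectors. The data matrix is an assumed illustrative example (species 4 is an exact linear function of species 3), and all function names are hypothetical.

```python
import random

# Illustrative data (assumed values): 5 sites by 4 species, with y4 = 10 - 2*y3.
Y = [
    [1, 5, 2, 6],
    [2, 2, 1, 8],
    [3, 1, 3, 4],
    [4, 2, 5, 0],
    [5, 5, 4, 2],
]
n, p = len(Y), len(Y[0])

def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def quadratic_form(matrix, t):
    """Return t' M t for a square matrix M and a vector t."""
    size = len(t)
    return sum(t[j] * matrix[j][k] * t[k] for j in range(size) for k in range(size))

# Dispersion matrix S computed from the centred data (eq. 4.6).
means = [sum(col) / n for col in zip(*Y)]
centred = [[row[j] - means[j] for j in range(p)] for row in Y]
S = [[sum(row[j] * row[k] for row in centred) / (n - 1) for k in range(p)]
     for j in range(p)]

random.seed(0)
checks = []
for _ in range(5):
    t = [random.uniform(-1, 1) for _ in range(p)]
    projected = [sum(y * w for y, w in zip(row, t)) for row in Y]  # the variable Yt
    checks.append((quadratic_form(S, t), sample_variance(projected)))

# A direction along which the projected variable is constant: t'St is null.
null_direction = [0, 0, 2, 1]
q_null = quadratic_form(S, null_direction)

print(checks)
print(q_null)
```

The last vector shows why the matrix is semidefinite rather than definite in this assumed example: 2·y3 + y4 is constant across sites, so its variance is zero.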

An important property can be derived by computing the quadratic form of the dispersion matrix S using eq. 2.28 (right). In that case, $\mathbf{U}^{-1} = \mathbf{U}'$ because S is symmetric (property #7 of inverses, Section 2.8), and eq. 2.28 (right) becomes:

$$\mathbf{U}'\mathbf{S}\mathbf{U} = \boldsymbol{\Lambda}$$

As vector t in the quadratic form, use the successive eigenvectors $\mathbf{u}_j$ from matrix U. For each vector $\mathbf{u}_j$, the development above shows that $\mathbf{u}_j'\mathbf{S}\mathbf{u}_j \geq 0$. Since $\mathbf{u}_j'\mathbf{S}\mathbf{u}_j = \lambda_j$, this demonstrates that all the eigenvalues $\lambda_j$ of S are positive or null. This property of dispersion matrices is fundamental in numerical ecology: it allows one to partition the variance among real principal axes (Sections 4.4 and 9.1).

Ideally, matrix S (of order p) should be estimated from a number of observations n larger than the number of descriptors p. When n ≤ p, the rank of matrix S is n − 1 and, consequently, only n − 1 of its rows or columns are independent, so that p − (n − 1) null eigenvalues are produced. The only practical consequence of n ≤ p is thus the presence of null eigenvalues in the principal component solution (Section 9.1). The first few eigenvalues of S, which are generally those of interest, are positive.

4.2 Correlation matrix

The previous section has shown that the covariance provides information on the orientation of the cloud of data points in the space defined by the descriptors. That statistic, however, does not provide any information on the intensity of the relationship between variables $y_j$ and $y_k$. Indeed, the covariance may increase or decrease without changing the relationship between $y_j$ and $y_k$. For example, in Fig. 4.3, the two clouds of points correspond to different covariances (factor two in size, and thus in covariance), but the relationship between variables is identical (same shape). Since the covariance depends on the dispersion of points around the mean of each variable (i.e. their variances), determining the intensity of the relationship between variables requires controlling for the variances.

The covariance measures the joint dispersion of two random variables around their means. The correlation is defined as a measure of the dependence between two random variables $y_j$ and $y_k$. As already explained in Section 1.5, it often happens that matrices of ecological data contain descriptors with no common scale, e.g. when some species are more abundant than others by orders of magnitude, or when the descriptors have different physical dimensions (Chapter 3). Calculating covariances on such variables obviously does not make sense, except if the descriptors are first reduced to a common scale. The procedure consists in centring all descriptors on a zero mean and reducing them to unit standard deviation (eq. 1.12). By using standardized descriptors, it is possible to calculate meaningful covariances, because the new variables have the same scale (i.e. unit standard deviation) and are dimensionless (see Chapter 3).

Figure 4.3 Several observations (objects), with descriptors $y_j$ and $y_k$, were made under two different sets of conditions (A and B). The two ellipses delineate clouds of point-objects corresponding to A and B, respectively. The covariance of $y_j$ and $y_k$ is twice as large for B as it is for A (larger ellipse), but the correlation between the two descriptors is the same in these two cases (i.e. the ellipses have the same shape).

The covariance of two standardized descriptors is called the coefficient of linear correlation (Pearson r). This statistic was proposed by the statistician Karl Pearson and is named after him. Given two standardized descriptors (eq. 1.12)

$$z_{ij} = \frac{y_{ij}-\bar{y}_j}{s_j} \quad\text{and}\quad z_{ik} = \frac{y_{ik}-\bar{y}_k}{s_k}$$

calculating their covariance (eq. 4.4) gives

$$s(z_j, z_k) = \frac{1}{n-1}\sum_{i=1}^{n}(z_{ij}-0)(z_{ik}-0) \quad\text{because}\quad \bar{z}_j = \bar{z}_k = 0$$

$$s(z_j, z_k) = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{y_{ij}-\bar{y}_j}{s_j}\right)\left(\frac{y_{ik}-\bar{y}_k}{s_k}\right)$$

$$s(z_j, z_k) = \frac{1}{s_j s_k}\,\frac{1}{n-1}\sum_{i=1}^{n}(y_{ij}-\bar{y}_j)(y_{ik}-\bar{y}_k)$$

$$s(z_j, z_k) = \frac{1}{s_j s_k}\,s_{jk} = r_{jk}$$

the coefficient of linear correlation between $y_j$ and $y_k$. The developed formula is:

$$r_{jk} = \frac{s_{jk}}{s_j s_k} = \frac{\sum_{i=1}^{n}(y_{ij}-\bar{y}_j)(y_{ik}-\bar{y}_k)}{\sqrt{\sum_{i=1}^{n}(y_{ij}-\bar{y}_j)^2\,\sum_{i=1}^{n}(y_{ik}-\bar{y}_k)^2}} \qquad (4.7)$$

As in the case of dispersion (Section 4.1), it is possible to construct the correlation matrix of Y, i.e. the $\mathbf{P}$ (rho) matrix, whose elements are the coefficients of linear correlation $\rho_{jk}$:

$$\mathbf{P} = \begin{bmatrix} 1 & \rho_{12} & \cdots & \rho_{1p} \\ \rho_{21} & 1 & \cdots & \rho_{2p} \\ \vdots & \vdots & & \vdots \\ \rho_{p1} & \rho_{p2} & \cdots & 1 \end{bmatrix} \qquad (4.8)$$

The correlation matrix is the dispersion matrix of the standardized variables. This concept will play a fundamental role in principal component analysis (Section 9.1). It should be noted that the diagonal elements of $\mathbf{P}$ are all equal to 1. This is because the comparison of any descriptor with itself is a case of complete dependence, which leads to a correlation $\rho_{jj} = 1$. When $y_j$ and $y_k$ are independent of each other, $\rho_{jk} = 0$. However, a correlation equal to zero does not necessarily imply that $y_j$ and $y_k$ are independent of each other, as shown by the following numerical example. A correlation $\rho_{jk} = -1$ is indicative of a complete, but inverse dependence of the two variables.

Numerical example. Using the values in Table 4.3, matrix R can easily be computed. Each element $r_{jk}$ combines, according to eq. 4.7, the covariance $s_{jk}$ with variances $s_j$ and $s_k$:

$$\mathbf{R} = \begin{bmatrix} 1 & 0 & 0.8 & -0.8 \\ 0 & 1 & 0 & 0 \\ 0.8 & 0 & 1 & -1 \\ -0.8 & 0 & -1 & 1 \end{bmatrix}$$

Matrix R is symmetric, as was matrix S. The correlation r = −1 between species 3 and 4 means that these species are fully, but inversely, dependent (Fig. 4.4a). Correlations r = 0.8 and −0.8 are interpreted as indications of strong dependence between species 1 and 3 (direct) and species 1 and 4 (inverse), respectively. The zero correlation between species 2 and the other three species must be interpreted with caution. Figure 4.4d clearly shows that species 1 and 2 are completely dependent on each other since they are related by equation $y_2 = 1 + (3 - y_1)^2$; the zero correlation is, in this case, a consequence of the linear model underlying statistic r. Therefore, only those correlations which are significantly different from zero should be considered, since a null correlation has no unique interpretation.

Figure 4.4 Numerical example. Relationships between species (a) 3 and 4, (b) 1 and 4, (c) 1 and 3, and (d) 1 and 2.

Since the correlation matrix is the dispersion matrix of standardized variables, it is possible, as in the case of matrix S (eq. 4.6), to compute R directly by multiplying the matrix of standardized data with its transpose:

$$\mathbf{R} = \frac{1}{n-1}\left[\frac{\mathbf{y}-\bar{\mathbf{y}}}{s_y}\right]'\left[\frac{\mathbf{y}-\bar{\mathbf{y}}}{s_y}\right] = \frac{1}{n-1}\mathbf{Z}'\mathbf{Z} \qquad (4.9)$$

Table 4.4 shows how to calculate correlations $r_{jk}$ of the example as in Table 4.3, using this time the standardized data. The mean of each standardized variable is zero and its standard deviation is equal to unity. The dispersion matrix of Z is identical to the correlation matrix of Y, which was calculated above using the covariances and variances.
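The route through standardized data (eq. 4.9) can be sketched as follows. The data matrix is an assumption made for illustration: its values were chosen here to satisfy the relationships stated in the numerical example (species 2 equal to 1 + (3 − y1)², species 3 and 4 perfectly and inversely related) and are not claimed to be the published table. Variable names are hypothetical.

```python
import math

# Illustrative data (assumed values): 5 sites by 4 species.
Y = [
    [1, 5, 2, 6],
    [2, 2, 1, 8],
    [3, 1, 3, 4],
    [4, 2, 5, 0],
    [5, 5, 4, 2],
]
n, p = len(Y), len(Y[0])

# Column means and standard deviations (eqs. 4.3 and 4.5).
means = [sum(col) / n for col in zip(*Y)]
sds = [
    math.sqrt(sum((row[j] - means[j]) ** 2 for row in Y) / (n - 1))
    for j in range(p)
]

# Standardized data Z: zero mean and unit standard deviation per column.
Z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in Y]

# R = Z'Z / (n - 1): the dispersion matrix of the standardized variables.
R = [[sum(z[j] * z[k] for z in Z) / (n - 1) for k in range(p)] for j in range(p)]

for row in R:
    print([round(value, 3) for value in row])
```

The zero correlations in the second row arise even though, in these assumed data, species 2 is an exact quadratic function of species 1, which is the caution about null correlations made in the text.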

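Table 4.4 Numerical example. Calculation of standardized data and correlations. [Table layout: sites in rows; original data Y and standardized data Z in columns; a row of means; and the resulting matrix $\mathbf{R}_y = \mathbf{S}_z = \frac{1}{n-1}\mathbf{Z}'\mathbf{Z}$. The numerical values did not survive extraction.]

Matrices $\boldsymbol{\Sigma}$ and $\mathbf{P}$ are related to each other by the diagonal matrix of standard deviations of Y. This new matrix, which is specifically designed for relating $\boldsymbol{\Sigma}$ and $\mathbf{P}$, is symbolized by $\mathbf{D}(\sigma)$ and its inverse by $\mathbf{D}(\sigma)^{-1}$:

$$\mathbf{D}(\sigma) = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma_p \end{bmatrix} \quad\text{and}\quad \mathbf{D}(\sigma)^{-1} = \begin{bmatrix} 1/\sigma_1 & 0 & \cdots & 0 \\ 0 & 1/\sigma_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1/\sigma_p \end{bmatrix}$$

Using these two matrices, one can write:

$$\mathbf{P} = \mathbf{D}(\sigma^2)^{-1/2}\,\boldsymbol{\Sigma}\,\mathbf{D}(\sigma^2)^{-1/2} = \mathbf{D}(\sigma)^{-1}\,\boldsymbol{\Sigma}\,\mathbf{D}(\sigma)^{-1} \qquad (4.10)$$

where $\mathbf{D}(\sigma^2)$ is the matrix of the diagonal elements of $\boldsymbol{\Sigma}$. It follows from eq. 4.10 that:

$$\boldsymbol{\Sigma} = \mathbf{D}(\sigma)\,\mathbf{P}\,\mathbf{D}(\sigma) \qquad (4.11)$$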
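Applied to sample estimates, the two relations above (eqs. 4.10 and 4.11) amount to pre- and post-multiplying by a diagonal matrix. The sketch below converts an assumed dispersion matrix `S` to a correlation matrix and back; the numerical values of `S` and the helper names are illustrative assumptions, not values taken from the text.

```python
import math

def matmul(a, b):
    b_columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns] for row in a]

def diagonal(values):
    """Square matrix with the given values on the diagonal and zeros elsewhere."""
    size = len(values)
    return [[values[j] if j == k else 0.0 for k in range(size)] for j in range(size)]

# Assumed dispersion matrix for four descriptors (illustrative values).
S = [
    [2.5, 0.0, 2.0, -4.0],
    [0.0, 3.5, 0.0, 0.0],
    [2.0, 0.0, 2.5, -5.0],
    [-4.0, 0.0, -5.0, 10.0],
]

sds = [math.sqrt(S[j][j]) for j in range(len(S))]  # square roots of the diagonal
D = diagonal(sds)                                  # D(s)
D_inv = diagonal([1 / s for s in sds])             # D(s)^-1

R = matmul(matmul(D_inv, S), D_inv)   # eq. 4.10 applied to estimates
S_back = matmul(matmul(D, R), D)      # eq. 4.11 recovers the dispersion matrix

for row in R:
    print([round(value, 3) for value in row])
```

Element by element this is simply r_jk = s_jk / (s_j s_k), as in eq. 4.7; the matrix form is convenient when the whole matrix is converted at once.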

Significance of r. The theory underlying tests of significance is discussed in Section 1.2. In the case of r, inference about the statistical population is in most instances through the null hypothesis $H_0$: ρ = 0. $H_0$ may also state that ρ has some other value than zero, which would be derived from ecological hypotheses. The general formula for testing correlation coefficients is given in Section 4.5 (eq. 4.39). The Pearson correlation coefficient $r_{jk}$ involves two descriptors (e.g. $y_j$ and $y_k$, hence m = 2 when testing a coefficient of simple linear correlation using eq. 4.39), so that $\nu_1 = m - 1 = 1$ and $\nu_2 = n - m = n - 2 = \nu$. The general formula then becomes:

$$F = \frac{r_{jk}^2 / 1}{(1 - r_{jk}^2)/\nu} = \nu\,\frac{r_{jk}^2}{1 - r_{jk}^2} \qquad (4.12)$$

where ν = n − 2. Statistic F is tested against $F_{\alpha[1,\nu]}$. Since the square root of a statistic $F_{[\nu_1,\nu_2]}$ is a statistic $t_{[\nu = \nu_2]}$ when $\nu_1 = 1$, r may also be tested using:

$$t = \frac{r_{jk}\sqrt{\nu}}{\sqrt{1 - r_{jk}^2}} \qquad (4.13)$$

The t statistic is tested against the value $t_{\alpha[\nu]}$. In other words, $H_0$ is tested by comparing the F (or t) statistic to the value found in a table of critical values of F (or t). Results of tests with eqs. 4.12 and 4.13 are identical. The number of degrees of freedom is ν = (n − 2) because calculating a correlation coefficient requires prior estimation of two parameters, i.e. the means of the two populations (eq. 4.7). $H_0$ is rejected when the probability corresponding to F (or t) is smaller than a predetermined level of significance (α for a two-tailed test, and α/2 for a one-tailed test; the difference between the two types of tests is explained in Section 1.2).

In principle, this test requires that the sample of observations be drawn from a population with a bivariate normal distribution (Section 4.3). Testing for normality and multinormality is discussed in Section 4.7, and normalizing transformations in Section 1.5. When the data do not satisfy the condition of normality, t can be tested by randomization, as shown in Section 1.2.

Test of independence of variables. It is also possible to test the independence of all variables in a data matrix by considering the set of all correlation coefficients found in matrix R. The null hypothesis here is that the p(p − 1)/2 coefficients are all equal to zero, $H_0$: R = I (unit matrix).
According to Bartlett (1954), the determinant of R, |R|, can be transformed into a $X^2$ (chi-square) test statistic:

$$X^2 = -\left[(n - 1) - \frac{2p + 5}{6}\right]\ln|\mathbf{R}| \qquad (4.14)$$

where ln|R| is the natural logarithm of the determinant of R. This statistic is approximately distributed as $\chi^2$ with ν = p(p − 1)/2 degrees of freedom. When the probability associated with $X^2$ is significantly low, the null hypothesis of complete independence of the p descriptors is rejected. In principle, this test requires the observations to be drawn from a population with a multivariate normal distribution (Section 4.3). If the null hypothesis of independence of all variables is rejected, the p(p − 1)/2 correlation coefficients in matrix R may be tested individually; see Box 1.3 about multiple testing.

Other correlation coefficients are described in Section 4.5 and in Chapter 5. Wherever the coefficient of linear correlation must be distinguished from other coefficients, it is referred to as Pearson's r. In other instances, r is simply called the coefficient of linear correlation or correlation coefficient. Table 4.5 summarizes the main properties of this coefficient.

Table 4.5 Main properties of the coefficient of linear correlation. Some of these properties are discussed in later sections.

Properties                                                                              Sections
1. The coefficient of linear correlation measures the intensity of the linear
   relationship between two random variables.                                           4.2
2. The coefficient of linear correlation between two variables can be calculated
   using their respective variances and their covariance.                               4.2
3. The correlation matrix is the dispersion matrix of standardized variables.           4.2
4. The square of the coefficient of linear correlation is the coefficient of
   determination. It measures how much of the variance of each variable is
   explained by the other.                                                              10.3
5. The coefficient of linear correlation is a parameter of a multinormal
   distribution.                                                                        4.3
6. The coefficient of linear correlation is the geometric mean of the coefficients
   of linear regression of each variable on the other.                                  10.3
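The two test statistics of this section, F and t for a single correlation with ν = n − 2 and Bartlett's chi-square statistic for a whole correlation matrix with p(p − 1)/2 degrees of freedom, can be sketched as below. Only the statistics are computed; the associated probabilities would still have to be read from tables of F, t or chi-square, or obtained by randomization. The function names, the value r = 0.8 with n = 5, the 3 × 3 matrix `R_example` and n = 20 are all hypothetical choices for this illustration.

```python
import math

def correlation_test_statistics(r, n):
    """F and t statistics for H0: rho = 0, with nu = n - 2 (requires |r| < 1)."""
    nu = n - 2
    f_stat = nu * r ** 2 / (1 - r ** 2)
    t_stat = r * math.sqrt(nu) / math.sqrt(1 - r ** 2)
    return f_stat, t_stat, nu

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda i: abs(a[i][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign of the determinant
        det *= a[col][col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
    return det

def bartlett_statistic(R, n):
    """X2 = -[(n - 1) - (2p + 5)/6] ln|R|, with p(p - 1)/2 degrees of freedom."""
    p = len(R)
    x2 = -((n - 1) - (2 * p + 5) / 6) * math.log(determinant(R))
    df = p * (p - 1) // 2
    return x2, df

# Hypothetical single correlation: r = 0.8 estimated from n = 5 objects.
f_stat, t_stat, nu = correlation_test_statistics(r=0.8, n=5)

# Hypothetical non-singular correlation matrix estimated from n = 20 objects.
R_example = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.1],
    [0.2, 0.1, 1.0],
]
x2, df = bartlett_statistic(R_example, n=20)

print(f_stat, t_stat, nu)
print(x2, df)
```

A correlation matrix containing a coefficient of exactly ±1 is singular, so its determinant is zero and the logarithm is undefined; that is why a separate, non-singular matrix is assumed here.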

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 9 Multicolliearity Dr Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Multicolliearity diagostics A importat questio that

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

Soo King Lim Figure 1: Figure 2: Figure 3: Figure 4: Figure 5: Figure 6: Figure 7:

Soo King Lim Figure 1: Figure 2: Figure 3: Figure 4: Figure 5: Figure 6: Figure 7: 0 Multivariate Cotrol Chart 3 Multivariate Normal Distributio 5 Estimatio of the Mea ad Covariace Matrix 6 Hotellig s Cotrol Chart 6 Hotellig s Square 8 Average Value of k Subgroups 0 Example 3 3 Value

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),

More information

11 Correlation and Regression

11 Correlation and Regression 11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record

More information

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca

More information

Session 5. (1) Principal component analysis and Karhunen-Loève transformation

Session 5. (1) Principal component analysis and Karhunen-Loève transformation 200 Autum semester Patter Iformatio Processig Topic 2 Image compressio by orthogoal trasformatio Sessio 5 () Pricipal compoet aalysis ad Karhue-Loève trasformatio Topic 2 of this course explais the image

More information

General IxJ Contingency Tables

General IxJ Contingency Tables page1 Geeral x Cotigecy Tables We ow geeralize our previous results from the prospective, retrospective ad cross-sectioal studies ad the Poisso samplig case to x cotigecy tables. For such tables, the test

More information

Chi-Squared Tests Math 6070, Spring 2006

Chi-Squared Tests Math 6070, Spring 2006 Chi-Squared Tests Math 6070, Sprig 2006 Davar Khoshevisa Uiversity of Utah February XXX, 2006 Cotets MLE for Goodess-of Fit 2 2 The Multiomial Distributio 3 3 Applicatio to Goodess-of-Fit 6 3 Testig for

More information
