Title: Bounds and normalization of the composite linkage disequilibrium coefficient.

Genetic Epidemiology 2004, 27:252-257
Author: Dmitri V. Zaykin*
* GlaxoSmithKline Inc., Research Triangle Park, NC
Contact details: Dmitri Zaykin, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709; email: zaykind@niehs.nih.gov
Running title: Composite LD bounds

Abstract
The composite linkage disequilibrium (LD) measure is often calculated for two-locus genotypic data, especially when coupling and repulsion double heterozygotes cannot be distinguished. This measure has been reported to have good statistical properties and was suggested for routine testing of LD regardless of Hardy-Weinberg equilibrium at either of two loci [Weir, 1979, Schaid, 2004]. However, the bounds for this measure have not yet been reported. These bounds are derived here as functions of one-locus genotype or allele frequencies. They provide standardized measures of composite linkage disequilibrium, defined as the proportion of its maximum attainable value given observed allele or genotype frequencies.

Introduction
Calculation of the composite disequilibrium coefficient, $\Delta_{AB}$, for measuring association between alleles $A$ and $B$ at two loci is a routine step in the analysis of multilocus genotypic data [Weir, 1979, Weir and Cockerham, 1979, Weir, 1996]. The composite coefficient is defined as $\Delta_{AB} = P_{AB} + P_{A/B} - 2 p_A p_B$, where $P_{AB}$ is the frequency of gamete $AB$, $P_{A/B}$ is the joint frequency of alleles $A$ and $B$ at two different gametes, and $p_A$, $p_B$ are the frequencies of alleles $A$ and $B$ at the two loci [Weir, 1996]. This coefficient is directly estimated from two-locus genotypic data and under random mating corresponds to the usual measure of linkage disequilibrium, $D_{AB} = P_{AB} - p_A p_B$. This is because the nongametic frequency reaches the equilibrium value $P_{A/B} = p_A p_B$ after one generation of random mating. Weir [1996] suggests the definition in which the focus is on two alleles at a time, $A$ and $B$, and other alleles at these two loci are combined and collectively referred to as $a$, $b$ with frequencies $(1 - p_A)$, $(1 - p_B)$. Weir [1996] defines the correlation coefficient, $r_{AB}$, which is one possible normalization of $\Delta_{AB}$,
$$r_{AB} = \frac{\hat\Delta_{AB}}{\sqrt{\left(\tilde p_A (1 - \tilde p_A) + \hat D_A\right)\left(\tilde p_B (1 - \tilde p_B) + \hat D_B\right)}}$$
where $\hat D_A$, $\hat D_B$ are the estimates of Hardy-Weinberg disequilibrium coefficients and $\tilde p_A$, $\tilde p_B$ are sample allele frequencies at loci A, B. It is most useful in that it directly relates to the asymptotic chi-square test statistic with one degree of freedom for testing the hypothesis $\Delta_{AB} = 0$. The relation is $X^2 = n r_{AB}^2$, where $n$ is the number of individuals in the sample. Constraints for the frequencies of genotypes and alleles at each locus

impose bounds on the minimum and maximum values of $r_{AB}$. As a result, the range of possible values of $r_{AB}$ is smaller than ($-1$ to $1$). In certain situations, it might be useful to report values of $\Delta_{AB}$ as the proportion of its maximum possible value, $\Delta_{\max}$. For example, Clark et al. [2003] evaluated the distribution of this standardized measure across 4,833 SNPs in the human genome; however, they obtained the bounds ($\Delta_{\max}$) numerically. The purpose of this report is to derive the bounds on $\Delta_{AB}$. The standardized measure is defined as $\Delta_{AB} / \Delta_{\max}$ and ranges between $-1$ and $1$. The standardization makes the new measure independent of single-locus frequencies, in the sense of the range that the coefficient can take. The allele frequencies are very much part of the usual normalized LD defined in the same way, $D' = D / D_{\max}$ [Hedrick, 1987, Lewontin, 1988], and it would be mistaken to interpret either $D'$ or the standardized composite coefficient as being free of dependences on the allele or genotype frequencies.

Statistical Methods
Using the notation of Weir [1996], the composite LD coefficient is estimated from di-locus counts and sample allele frequencies as
$$\hat\Delta_{AB} = \frac{1}{n}\left(2 n_{AABB} + n_{AABb} + n_{AaBB} + \tfrac{1}{2} n_{AaBb}\right) - 2 \tilde p_A \tilde p_B \qquad (1)$$
where $n$ is the number of individuals in the sample, and $n_{AABB}, \ldots, n_{aabb}$ denote di-locus sample genotype counts. One way to derive bounds for the sample value of $\hat\Delta_{AB}$ is to use

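the relation $\hat\Delta_{AB} = C(x, y) / (2n)$, where
$$C(x, y) = \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i,$$
and $x$, $y$ are vectors of genotype values for the two loci, re-coded as
$$x_i = \begin{cases} 1 & \text{if genotype is } AA \\ 0 & \text{if genotype is } Aa \\ -1 & \text{if genotype is } aa \end{cases} \qquad y_i = \begin{cases} 1 & \text{if genotype is } BB \\ 0 & \text{if genotype is } Bb \\ -1 & \text{if genotype is } bb \end{cases}$$
The correspondence between $\hat\Delta_{AB}$ and $C(x, y)$ can be shown by writing the composite LD coefficient in an alternative form:
$$\hat\Delta_{AB} = \frac{1}{2n}\left(n_{AABB} - n_{AAbb} - n_{aaBB} + n_{aabb}\right) - \frac{1}{2}(2\tilde p_A - 1)(2\tilde p_B - 1) = \frac{1}{2n}\left[\left(n_{AABB} - n_{AAbb} - n_{aaBB} + n_{aabb}\right) - \frac{1}{n}(n_{AA} - n_{aa})(n_{BB} - n_{bb})\right]$$
Sums in $C(x, y)$ can be written in terms of the genotype counts as
$$\sum x_i y_i = n_{AABB}(1)(1) + n_{AAbb}(1)(-1) + n_{aaBB}(-1)(1) + n_{aabb}(-1)(-1), \qquad \sum x_i \sum y_i = (n_{AA} - n_{aa})(n_{BB} - n_{bb})$$
Therefore, $C(x, y) = (n_{AABB} - n_{AAbb} - n_{aaBB} + n_{aabb}) - \frac{1}{n}(n_{AA} - n_{aa})(n_{BB} - n_{bb})$. Dividing this by $2n$, the relation to the composite LD as defined in (1) is $\hat\Delta_{AB} = C(x, y) / (2n)$. Sample allele frequencies are related to $x$, $y$ as follows: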

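$$\tilde p_A = \frac{2 n_{AA} + n_{Aa}}{2n} = \frac{\sum x_i + n}{2n}, \qquad \tilde p_B = \frac{2 n_{BB} + n_{Bb}}{2n} = \frac{\sum y_i + n}{2n}, \qquad \frac{1}{n}\sum x_i = 2\tilde p_A - 1, \qquad \frac{1}{n}\sum y_i = 2\tilde p_B - 1$$
Therefore, by taking expectations of these expressions, $E(x) = 2 p_A - 1$ and $E(y) = 2 p_B - 1$. It can be further verified that
$$\mathrm{Cov}(x, y) = 2\Delta_{AB}$$
$$E(xy) = \mathrm{Cov}(x, y) + E(x) E(y) = 2\Delta_{AB} + (2 p_A - 1)(2 p_B - 1) = P_{AABB} - P_{AAbb} - P_{aaBB} + P_{aabb}$$
$$\mathrm{Var}(x) = 4 P_{AA} P_{aa} + P_{Aa}(1 - P_{Aa}) = 2\left[p_A (1 - p_A) + D_A\right]$$
$$\mathrm{Var}(y) = 4 P_{BB} P_{bb} + P_{Bb}(1 - P_{Bb}) = 2\left[p_B (1 - p_B) + D_B\right]$$
where $D_A = P_{AA} - p_A^2$ and $D_B = P_{BB} - p_B^2$ are Hardy-Weinberg disequilibrium coefficients [Weir, 1996] and $P_{AA}, \ldots, P_{bb}$ are the frequencies of the single-locus genotypes $AA, \ldots, bb$. Then
$$\mathrm{Corr}(x, y) = \frac{\Delta_{AB}}{\sqrt{\left(p_A (1 - p_A) + D_A\right)\left(p_B (1 - p_B) + D_B\right)}}$$
The indicator variables $x$, $y$ are the sums of those given by Weir [1979], $(x_A + x_A')$ and $(y_B + y_B')$, minus one. These variables keep track of the number of copies of $A$ (and $B$) on the two gametes. Weir showed that the correlation of the sums is the same as the expression for $\mathrm{Corr}(x, y)$.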
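The equivalence between equation (1) and the cross-product form $C(x, y)/(2n)$ can be checked numerically. The following is a minimal sketch, assuming each individual is recorded as a pair (copies of allele A, copies of allele B); the function name `composite_ld` and the toy `sample` are hypothetical and not part of the original paper.

```python
from math import sqrt

# Weight of each two-locus genotype (copies of A, copies of B) in equation (1).
N_AB_WEIGHTS = {(2, 2): 2.0, (2, 1): 1.0, (1, 2): 1.0, (1, 1): 0.5}


def composite_ld(genotypes):
    """Return (delta from eq. (1), delta from C(x, y)/(2n), correlation r)."""
    n = len(genotypes)
    p_a = sum(a for a, _ in genotypes) / (2 * n)
    p_b = sum(b for _, b in genotypes) / (2 * n)

    # Equation (1): direct estimate from di-locus genotype counts.
    n_ab = sum(N_AB_WEIGHTS.get(g, 0.0) for g in genotypes)
    delta_eq1 = n_ab / n - 2 * p_a * p_b

    # Cross-product form with genotypes re-coded as 1 / 0 / -1.
    x = [a - 1 for a, _ in genotypes]
    y = [b - 1 for _, b in genotypes]
    c_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    delta_cov = c_xy / (2 * n)

    # Hardy-Weinberg disequilibrium estimates and the correlation coefficient r.
    d_a = sum(1 for a, _ in genotypes if a == 2) / n - p_a ** 2
    d_b = sum(1 for _, b in genotypes if b == 2) / n - p_b ** 2
    r = delta_eq1 / sqrt((p_a * (1 - p_a) + d_a) * (p_b * (1 - p_b) + d_b))
    return delta_eq1, delta_cov, r


# Hypothetical toy sample of 20 individuals.
sample = ([(2, 2)] * 4 + [(2, 1)] * 2 + [(1, 2)] * 1 + [(1, 1)] * 5
          + [(1, 0)] * 2 + [(0, 1)] * 3 + [(0, 0)] * 3)
delta_eq1, delta_cov, r = composite_ld(sample)
print(delta_eq1, delta_cov, r)
```

Both estimates agree for this sample, and $r$ coincides with the ordinary Pearson correlation of the coded vectors $x$ and $y$.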

When $\hat\Delta_{AB} > 0$, the pairs $\{x = 1, y = 1\}$ and $\{x = -1, y = -1\}$ increase the resulting value of the cross-product, $\sum x_i y_i$, while pairs with $x = 0$ or $y = 0$, such as $\{x = 1, y = 0\}$ and $\{x = 0, y = 1\}$, do not add to the value, and the pairs $\{x = 1, y = -1\}$, $\{x = -1, y = 1\}$ decrease the value. This suggests a way to derive the bounds for $\hat\Delta_{AB}$ by finding the permutation of $x$ relative to $y$ that will maximize the absolute value of $\hat\Delta_{AB}$, or equivalently, of $C(x, y)$. These bounds are for fixed one-locus counts. The permutation should maximize the number of $\{x, y\}$ pairs with the same sign of $x$ and $y$ and distribute the remaining $x$ with $y$ in such a way that the value of the cross-product is decreased as little as possible. The second term in $C(x, y)$, i.e. the product $\sum x_i \sum y_i$, is not affected by the way the vectors are permuted. When $\hat\Delta_{AB} < 0$, the pairs $\{x, y\}$ are arranged in a way that maximizes the number of pairs with different signs for $x$ and $y$.

Consider the calculation for the case $\hat\Delta_{AB} > 0$ in more detail. The maximum possible number of pairs of the same sign is
$$d = \min(n_{AA}, n_{BB}) + \min(n_{aa}, n_{bb}) \qquad (2)$$
which is the sum of the largest possible numbers of pairs $\{x = 1, y = 1\}$ and $\{x = -1, y = -1\}$. The remaining number of pairs is $(n - d)$, so potentially this is the amount by which the value of the cross-product given by (2) can be reduced. However, these pairs can be arranged so that as many values of $x = 0$ or $y = 0$ (heterozygotes) as

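possible are matched in pairs with the second of the two values being different from zero. Then the overall reduction of the cross-product from the value in (2) is
$$s = \max\left[(n - d) - (n_{Aa} + n_{Bb}),\, 0\right] = n - d - \min(n - d,\, n_{Aa} + n_{Bb})$$
Therefore, the maximum value for the cross-product is $(d - s)$ and the bound for $\hat\Delta_{AB} > 0$ is
$$\frac{d - s}{2n} - \frac{1}{2n^2}(n_{AA} - n_{aa})(n_{BB} - n_{bb}) = \frac{d - s}{2n} - \frac{1}{2}(2\tilde p_A - 1)(2\tilde p_B - 1)$$
For the case $\hat\Delta_{AB} < 0$, the minimum is obtained the same way, but the value $d$ is calculated as $d = \min(n_{AA}, n_{bb}) + \min(n_{aa}, n_{BB})$ to match $\{x, y\}$ pairs with the opposite signs. This gives the maximum absolute value:
$$\left| -\frac{d - s}{2n} - \frac{1}{2}(2\tilde p_A - 1)(2\tilde p_B - 1) \right| = \frac{d - s}{2n} + \frac{1}{2}(2\tilde p_A - 1)(2\tilde p_B - 1)$$
Putting these together, the possible values of $\hat\Delta_{AB}$ are bounded by functions of single-locus genotype counts as
$$\hat\Delta_{AB} \ge -\frac{d - s}{2n} - \frac{(n_{AA} - n_{aa})(n_{BB} - n_{bb})}{2n^2}, \quad \text{where } d = \min(n_{AA}, n_{bb}) + \min(n_{aa}, n_{BB}), \quad \hat\Delta_{AB} < 0$$
$$\hat\Delta_{AB} \le \frac{d - s}{2n} - \frac{(n_{AA} - n_{aa})(n_{BB} - n_{bb})}{2n^2}, \quad \text{where } d = \min(n_{AA}, n_{BB}) + \min(n_{aa}, n_{bb}), \quad \hat\Delta_{AB} > 0 \qquad (3)$$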
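The bounds in (3) depend only on the six single-locus genotype counts, so they are straightforward to compute. The sketch below is one possible reading of (3), not a reference implementation; the function names are hypothetical, and returning zero when a bound is exactly zero is an assumption made here to avoid division by zero.

```python
def genotype_count_bounds(n_AA, n_Aa, n_aa, n_BB, n_Bb, n_bb):
    """Lower and upper bounds of the sample composite LD, following (3)."""
    n = n_AA + n_Aa + n_aa
    if n != n_BB + n_Bb + n_bb:
        raise ValueError("both loci must be typed in the same n individuals")
    # Second term of C(x, y) / (2n); it does not depend on the permutation.
    offset = (n_AA - n_aa) * (n_BB - n_bb) / (2 * n * n)

    def extreme_cross_product(d):
        # s: leftover pairs that cannot be absorbed by heterozygotes.
        s = n - d - min(n - d, n_Aa + n_Bb)
        return d - s

    d_upper = min(n_AA, n_BB) + min(n_aa, n_bb)  # same-sign pairs, equation (2)
    d_lower = min(n_AA, n_bb) + min(n_aa, n_BB)  # opposite-sign pairs
    upper = extreme_cross_product(d_upper) / (2 * n) - offset
    lower = -extreme_cross_product(d_lower) / (2 * n) - offset
    return lower, upper


def standardized_delta(delta, lower, upper):
    """delta as a proportion of the bound matching its sign (0.0 if that bound is 0)."""
    bound = upper if delta > 0 else -lower
    return delta / bound if bound > 0 else 0.0


# Hypothetical counts: 6 AA, 8 Aa, 6 aa and 5 BB, 10 Bb, 5 bb; delta = 0.175.
lower, upper = genotype_count_bounds(6, 8, 6, 5, 10, 5)
print(lower, upper, standardized_delta(0.175, lower, upper))
```

For these counts the allele frequencies are both one half, so the offset term vanishes and the bounds are symmetric around zero.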

In (3), $s = n - d - \min(n - d,\, n_{Aa} + n_{Bb})$. The second part of the bounds is a function of sample allele frequencies,
$$\frac{1}{2n^2}(n_{AA} - n_{aa})(n_{BB} - n_{bb}) = \frac{1}{2}(2\tilde p_A - 1)(2\tilde p_B - 1).$$
The population bounds, as functions of one-locus population frequencies, are
$$\Delta_{AB} \ge -\frac{d - s}{2} - \frac{1}{2}(2 p_A - 1)(2 p_B - 1), \quad \text{where } d = \min(P_{AA}, P_{bb}) + \min(P_{aa}, P_{BB}), \quad \Delta_{AB} < 0$$
$$\Delta_{AB} \le \frac{d - s}{2} - \frac{1}{2}(2 p_A - 1)(2 p_B - 1), \quad \text{where } d = \min(P_{AA}, P_{BB}) + \min(P_{aa}, P_{bb}), \quad \Delta_{AB} > 0 \qquad (4)$$
and $s = 1 - d - \min(1 - d,\, P_{Aa} + P_{Bb})$. The sample standardized value with bounds as given in (3) is denoted $\delta$.

Comparison of Standardized Coefficients
The correspondence between the values of $\hat D'$ (the maximum likelihood estimator based on the HWE assumption) and the standardized composite LD in large samples should be noted. The problem with the direct comparison of $\delta$ and $D'$ is that $D'$ bounds are for the fixed frequencies of alleles, whereas $\delta$ bounds are for the one-locus frequencies of genotypes. In this sense, $\delta$ is more comparable to $r$, which incorporates variances that include one-locus deviations from HWE. Numerically, the $\delta$ and $r$ coefficients are closely correlated, although $\delta$ is more stretched because it can still reach extreme ($-1$ to $1$) values even if allele frequencies at the two loci are unequal. This is not necessarily a virtue

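of the standardized measure ($\delta$). The coefficient $r$ has well-defined statistical and population-genetic properties. It is useful in contexts where an allele at one locus is to be regarded as a proxy for an allele at another locus, for example during the selection of markers for association studies [e.g. Meng et al., 2003]. This requires dependency (LD) as well as closeness of single-locus frequencies, which is then reflected by large absolute values of $r$.

To make a fair comparison with $D'$, equation (3) is modified so that the bounds are calculated as the maximum possible values given frequencies of alleles rather than genotypes. In this case the standardized composite LD can be compared to $D'$ directly. To obtain these bounds, the new values of $(d - s)/(2n)$ in (3) should be computed. This is done by replacing the single-locus counts in $d$ with their maximum values given the counts of alleles, $n_A$, $n_a$, $n_B$, $n_b$, and noticing that in this case $s = n - d$, so that $(d - s)/(2n) = (2d - n)/(2n)$. Then the bounds are
$$\hat\Delta_{AB} \ge -\frac{2d - n}{2n} - \frac{1}{2}(2\tilde p_A - 1)(2\tilde p_B - 1), \quad \text{where } d = \min\left(\frac{n_A}{2}, \frac{n_b}{2}\right) + \min\left(\frac{n_a}{2}, \frac{n_B}{2}\right), \quad \hat\Delta_{AB} < 0$$
$$\hat\Delta_{AB} \le \frac{2d - n}{2n} - \frac{1}{2}(2\tilde p_A - 1)(2\tilde p_B - 1), \quad \text{where } d = \min\left(\frac{n_A}{2}, \frac{n_B}{2}\right) + \min\left(\frac{n_a}{2}, \frac{n_b}{2}\right), \quad \hat\Delta_{AB} > 0$$
After expressing counts via sample frequencies, this further simplifies to
$$\hat\Delta_{AB} \ge -2 \min\left[\tilde p_A \tilde p_B,\ (1 - \tilde p_A)(1 - \tilde p_B)\right], \quad \hat\Delta_{AB} < 0$$
$$\hat\Delta_{AB} \le 2 \min\left[(1 - \tilde p_A)\tilde p_B,\ \tilde p_A (1 - \tilde p_B)\right], \quad \hat\Delta_{AB} > 0 \qquad (5)$$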
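The simplification leading to (5) can be verified numerically by computing the bounds both ways. This is a minimal sketch under the assumption that counts are expressed per individual (so that $n = 1$ and $d$ becomes a proportion); both function names are hypothetical.

```python
def allele_frequency_bounds(p_a, p_b):
    """Equation (5): bounds of the composite LD for fixed allele frequencies."""
    lower = -2 * min(p_a * p_b, (1 - p_a) * (1 - p_b))
    upper = 2 * min(p_a * (1 - p_b), (1 - p_a) * p_b)
    return lower, upper


def bounds_without_heterozygotes(p_a, p_b):
    """The same bounds from (2d - n) / (2n), written per individual (n = 1)."""
    offset = 0.5 * (2 * p_a - 1) * (2 * p_b - 1)
    # With no heterozygotes, the homozygote proportions equal the allele frequencies.
    d_upper = min(p_a, p_b) + min(1 - p_a, 1 - p_b)
    d_lower = min(p_a, 1 - p_b) + min(1 - p_a, p_b)
    upper = (2 * d_upper - 1) / 2 - offset
    lower = -(2 * d_lower - 1) / 2 - offset
    return lower, upper


print(allele_frequency_bounds(0.3, 0.6))
print(bounds_without_heterozygotes(0.3, 0.6))
```

Both functions give the same pair of bounds, which is twice the familiar range of the gametic coefficient $D$ for the same allele frequencies.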

Therefore, these bounds are twice the bounds for $D_{AB}$. Define the standardized coefficient as $\Delta'$ when the bounds are computed for the fixed allele frequencies, using (5). When the population is in HWE, the two estimators are related as $\hat\Delta' \approx \hat D'/2$. The reason for the value to be twice as small is that the composite disequilibrium is the sum of the usual LD, measured by $D_{AB}$, and the covariance between alleles at two different chromosomes, $D_{A/B}$. This second term is zero if the population is in HWE; however, the maximum value of $\Delta_{AB}$ still needs to account for $D_{A/B}$. One can only claim that $\Delta_{AB}$ reached a certain proportion of its maximum value, without attributing that proportion to either the gametic or the nongametic part.

The allele frequencies in $\hat\Delta'$ and $\hat D'$ are estimated in the same way; therefore the efficiency of these two coefficients in estimation of the population value of $D'$ is largely determined by the performance of the corresponding LD estimators. Schaid [2004] argued for superiority of the composite LD by examining properties of the test $H_0: D_{AB} = 0$. Therefore it is expected that $\hat\Delta'$ performs well compared to $\hat D'$. When the population is not in HWE, $\hat\Delta'$ is still a valid estimator for $(D'_{AB} + D'_{A/B})/2$, where the second term is the normalized $D_{A/B}$. Not unexpectedly, $\hat D'$ is not an appropriate estimator. Figure 1 is an illustration of this. To create each data point on the graphs, ten possible di-locus population genotype frequencies were drawn from the Dirichlet

distribution with all ten parameters equal to ¼ prior to obtaining each multinomial sample of 500 individuals. The Dirichlet(¼, ..., ¼) sampling creates a nearly uniform ($-1$ to $1$) distribution of $D'_{AB}$ and $D'_{A/B}$ across populations from which each multinomial sample is obtained. Figure 2 is a similar plot with all population disequilibrium due to the gametic part, $D_{AB}$. Such populations are created by sampling four gamete frequencies from a four-parameter Dirichlet distribution and pairing them at random. The resulting distribution of $D'_{AB}$ is close to uniform on ($-1$ to $1$). The estimator $\hat D'$ was obtained by numerically solving the likelihood formed under the assumption of HWE. Such a solution is feasible in the case of two loci and is preferred to the common alternative using an EM algorithm [Weir and Cockerham, 1979]. Under HWE (Figure 2) the coefficient $\hat\Delta'$ is estimating half of the gametic LD term, $D'_{AB}/2$. Both figures imply that $\hat\Delta'$ is performing well as an estimator of the population value, $(D'_{AB} + D'_{A/B})/2$, while $|\hat D'|$ takes many possible values between $|D'_{AB}|$ and $|D'_{AB} + D'_{A/B}|$ when the population is not in equilibrium. Performance of $\hat\Delta'$ further improves with increased sample size (data not shown). Note that the Figure 1 sampling of di-locus genotypes results in deviations from equilibrium at the level of two loci, which include non-zero single-locus HWE deviations as well as the higher order disequilibria. Weir [1996] gives definitions of all corresponding coefficients. Thus, the estimator $\hat\Delta'$ appears to perform well even when two-locus frequencies deviate from the equilibrium conditions.
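The relation between the two standardized coefficients under HWE can be illustrated at the population level. The sketch below draws four gamete frequencies, pairs gametes at random, and compares the normalized gametic LD with the composite LD standardized by the allele-frequency bounds. The Dirichlet parameters (all equal to one) are an assumption made for this illustration, and all function names are hypothetical.

```python
import random


def dirichlet(alphas, rng):
    """Draw one Dirichlet vector by normalizing independent gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [g / total for g in draws]


def normalized(value, p_a, p_b, scale):
    """value divided by scale times the allele-frequency bound for its sign."""
    if value > 0:
        bound = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        bound = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return value / (scale * bound)


def hwe_population(rng):
    """Return (D', standardized composite LD) for one randomly paired population."""
    # Gametes as (carries A, carries B): AB, Ab, aB, ab.
    gametes = [(1, 1), (1, 0), (0, 1), (0, 0)]
    freqs = dirichlet([1.0] * 4, rng)  # assumed parameters, see lead-in
    p_a = sum(f for (a, _), f in zip(gametes, freqs) if a)
    p_b = sum(f for (_, b), f in zip(gametes, freqs) if b)
    p_ab = freqs[0]
    # Nongametic frequency: A on the first gamete and B on the second.
    p_a_slash_b = sum(f1 * f2
                      for (a, _), f1 in zip(gametes, freqs)
                      for (_, b), f2 in zip(gametes, freqs)
                      if a and b)
    d_ab = p_ab - p_a * p_b
    delta = p_ab + p_a_slash_b - 2 * p_a * p_b
    return normalized(d_ab, p_a, p_b, 1), normalized(delta, p_a, p_b, 2)


rng = random.Random(2004)
d_prime, delta_prime = hwe_population(rng)
print(d_prime, delta_prime)
```

Because random pairing makes the nongametic term vanish, the standardized composite value comes out as half of the normalized gametic value in every simulated population.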

Discussion
As with the usual normalized LD coefficient, $D'$, caution should be taken when values of $\Delta'$ or $\delta$ are compared for two pairs of loci, or when values for the same pair of loci are compared between different populations. Similar evolutionary forces will not guarantee that normalized coefficients should attain similar values given two different sets of allele frequencies [Lewontin, 1988]. Indeed, it is the values at the bounds that are completely determined by the single-locus parameters, and values in the middle remain indeterminate. Nevertheless, the standardized coefficient can be useful in the sense of its definition as the proportional measure of strength of association between alleles at two loci.

It is worthwhile to note that the inter-gametic coefficient $D_{A/B}$ can be non-zero if sampling is conditional on the phenotype, as in case-control sampling. If alleles $A$, $B$ are jointly predictive for the case category, the value $D_{A/B}$ can be non-zero among cases, even if the population value is zero. If the prevalence of the case category is $w$, the allelic prevalence for the allele $A$ can be defined as $w_A = \sum_j P_{j|A} w_j$, where $j$ indexes genotypes and $P_{j|A}$, $w_j$ are the genotype frequency and its susceptibility. Suppose the case probability for the individuals carrying both alleles $A$ and $B$ is $w_{A,B}$. Then, among the cases,
$$D_{A/B} = \frac{w_{A,B}\, p_A p_B}{w} - \frac{p_A p_B\, w_A w_B}{w^2}$$
which is non-zero if $w_{A,B} \ne w_A w_B / w$. Similarly, $D_{AB}$ in cases can be away from the population value. The composite coefficient as well

as its standardized value can be examined in cases and compared to disequilibrium in controls or the population value. On the other hand, when the gametic phase is unknown, the likelihood (with the HWE assumption) used for the calculation of $\hat D$ and $\hat D'$, e.g. by means of the EM algorithm, is generally not suitable for evaluation of disequilibrium in cases. This is because the equilibrium proportions in cases (including single-locus HWE) are likely to be distorted [Nielsen et al., 1999]. Schaid [2004] showed that in such situations the incorrect HWE assumption can lead to grossly biased results of the test $H_0: D_{AB} = 0$, with either extremely conservative or liberal type-I error rates. On the other hand, tests based on the composite coefficient have optimal power and maintain the correct size.

References
Clark AG, Nielsen R, Signorovitch J, Matise TC, Glanowski S, Heil J, Winn-Deen ES, Holden AL, Lai E. 2003. Linkage disequilibrium and inference of ancestral recombination in 538 single-nucleotide polymorphism clusters across the human genome. Am J Hum Genet 73.
Hedrick PH. 1987. Gametic disequilibrium measures: proceed with caution. Genetics 117.
Lewontin RC. 1988. On measures of gametic disequilibrium. Genetics 120.
Meng Z, Zaykin DV, Xu CF, Wagner M, Ehm MG. 2003. Selection of genetic markers for association analyses, using linkage disequilibrium and haplotypes. Am J Hum Genet 73.
Nielsen DM, Ehm MG, Weir BS. 1999. Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am J Hum Genet 63.
Schaid DJ. 2004. Linkage disequilibrium testing when linkage phase is unknown. Genetics 166.

Weir BS. 1996. Genetic Data Analysis II. Sinauer Associates, Inc.
Weir BS. 1979. Inferences about linkage disequilibrium. Biometrics 35.
Weir BS, Cockerham CC. 1979. Estimation of linkage disequilibrium in randomly mating populations. Heredity 42:105.

Acknowledgements
Two reviewers provided useful comments that improved the quality of this manuscript.

Figure legends
Figure 1. Left figure: plot of the population value of $(D'_{AB} + D'_{A/B})/2$ vs. the sample value of $\hat\Delta'$. Right figure: plot of the population value of $(D'_{AB} + D'_{A/B})/2$ vs. the sample value of $\hat D'$.
Figure 2. Left figure: plot of the population value of $D'_{AB}/2$ vs. the sample value of $\hat\Delta'$. Right figure: plot of the population value of $D'_{AB}$ vs. the sample value of $\hat D'$.

Figure 1.

Figure 2.

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol Georgetown Unversty From the SelectedWorks of Mark J Meyer 8 Usng the estmated penetrances to determne the range of the underlyng genetc model n casecontrol desgn Mark J Meyer Neal Jeffres Gang Zheng Avalable

More information

Recall that quantitative genetics is based on the extension of Mendelian principles to polygenic traits.

Recall that quantitative genetics is based on the extension of Mendelian principles to polygenic traits. BIOSTT/STT551, Statstcal enetcs II: Quanttatve Trats Wnter 004 Sources of varaton for multlocus trats and Handout Readng: Chapter 5 and 6. Extensons to Multlocus trats Recall that quanttatve genetcs s

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

Economics 130. Lecture 4 Simple Linear Regression Continued

Economics 130. Lecture 4 Simple Linear Regression Continued Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information

Quantitative Genetic Models Least Squares Genetic Model. Hardy-Weinberg (1908) Principle. change of allele & genotype frequency over generations

Quantitative Genetic Models Least Squares Genetic Model. Hardy-Weinberg (1908) Principle. change of allele & genotype frequency over generations Quanttatve Genetc Models Least Squares Genetc Model Hardy-Wenberg (1908) Prncple partton of effects P = G + E + G E P s phenotypc effect G s genetc effect E s envronmental effect G E s nteracton effect

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

Checking Pairwise Relationships. Lecture 19 Biostatistics 666

Checking Pairwise Relationships. Lecture 19 Biostatistics 666 Checkng Parwse Relatonshps Lecture 19 Bostatstcs 666 Last Lecture: Markov Model for Multpont Analyss X X X 1 3 X M P X 1 I P X I P X 3 I P X M I 1 3 M I 1 I I 3 I M P I I P I 3 I P... 1 IBD states along

More information

Bayesian predictive Configural Frequency Analysis

Bayesian predictive Configural Frequency Analysis Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse

More information

Joint Statistical Meetings - Biopharmaceutical Section

Joint Statistical Meetings - Biopharmaceutical Section Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor Taylor Enterprses, Inc. Control Lmts for P Charts Copyrght 2017 by Taylor Enterprses, Inc., All Rghts Reserved. Control Lmts for P Charts Dr. Wayne A. Taylor Abstract: P charts are used for count data

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 13 The Smple Lnear Regresson Model and Correlaton 1999 Prentce-Hall, Inc. Chap. 13-1 Chapter Topcs Types of Regresson Models Determnng the Smple Lnear

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VIII LECTURE - 34 ANALYSIS OF VARIANCE IN RANDOM-EFFECTS MODEL AND MIXED-EFFECTS EFFECTS MODEL Dr Shalabh Department of Mathematcs and Statstcs Indan

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Lecture 3 Stat102, Spring 2007

Lecture 3 Stat102, Spring 2007 Lecture 3 Stat0, Sprng 007 Chapter 3. 3.: Introducton to regresson analyss Lnear regresson as a descrptve technque The least-squares equatons Chapter 3.3 Samplng dstrbuton of b 0, b. Contnued n net lecture

More information

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 6 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutons to assst canddates preparng for the eamnatons n future years and for

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline Outlne Bayesan Networks: Maxmum Lkelhood Estmaton and Tree Structure Learnng Huzhen Yu janey.yu@cs.helsnk.f Dept. Computer Scence, Unv. of Helsnk Probablstc Models, Sprng, 200 Notces: I corrected a number

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

LECTURE 9 CANONICAL CORRELATION ANALYSIS

LECTURE 9 CANONICAL CORRELATION ANALYSIS LECURE 9 CANONICAL CORRELAION ANALYSIS Introducton he concept of canoncal correlaton arses when we want to quantfy the assocatons between two sets of varables. For example, suppose that the frst set of

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 31 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 6. Rdge regresson The OLSE s the best lnear unbased

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek Dscusson of Extensons of the Gauss-arkov Theorem to the Case of Stochastc Regresson Coeffcents Ed Stanek Introducton Pfeffermann (984 dscusses extensons to the Gauss-arkov Theorem n settngs where regresson

More information

Goodness of fit and Wilks theorem

Goodness of fit and Wilks theorem DRAFT 0.0 Glen Cowan 3 June, 2013 Goodness of ft and Wlks theorem Suppose we model data y wth a lkelhood L(µ) that depends on a set of N parameters µ = (µ 1,...,µ N ). Defne the statstc t µ ln L(µ) L(ˆµ),

More information
