GENOMIC SELECTION ADDITIONAL TOPICS
- Eunice Lloyd
- 5 years ago
1 GENOMIC SELECTION ADDITIONAL TOPICS
2 OUTLINE
- INTRODUCTION: Some Basics of Regression in High-dimensional Problems
- BAYESIAN ALTERNATIVE: A Quick Tour on Bayesian Models Commonly Used in Genomic Selection
- COMPARISON OF GWMAS MODELS: Whole Genome Prediction Within and Across Environments: An Application to Wheat Yield
- MODELLING NON-ADDITIVE GENETIC EFFECTS: A Quick Tour on Semi-parametric Kernel-based Methods; An Example Using Nonparametric Methods
- SELECTIVE GENOTYPING: The Effect of Selectively Genotyping Individuals in Genomic Selection
3 INTRODUCTION Some Basics of Regression in High-dimensional Problems
4 GENOME-WIDE MARKER-ASSISTED SELECTION
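5 GENOME-WIDE MARKER-ASSISTED SELECTION (Meuwissen et al., 2001)
$y = \mu + x_1 g_1 + x_2 g_2 + \dots + x_p g_p + e$, where the $x_j$ are marker genotypes and the $g_j$ are their genetic effects.
Genomic EBV: $\mathrm{GEBV} = x_1 \hat{g}_1 + x_2 \hat{g}_2 + \dots + x_p \hat{g}_p = \sum_{j=1}^{p} x_j \hat{g}_j$
- "Big p, small n" paradigm.
- Dimension reduction techniques (e.g. SVD and PLS), and stepwise strategies.
- Alternatively: penalized regression, shrinkage estimation.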
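6 Linear Regression and Least Squares
$E[Y] = f(X) = \beta_0 + \sum_{j=1}^{p} X_j \beta_j$
Training data: $(x_1, y_1), \dots, (x_N, y_N)$; $x_i = (x_{i1}, x_{i2}, \dots, x_{ip})'$
$\mathrm{RSS}(\beta) = \sum_{i=1}^{N} (y_i - f(x_i))^2 = \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \right)^2 = (y - X\beta)'(y - X\beta)$
$\partial \mathrm{RSS} / \partial \beta = -2X'(y - X\beta)$; $\partial^2 \mathrm{RSS} / \partial \beta \, \partial \beta' = 2X'X$
$X'(y - X\beta) = 0 \;\Rightarrow\; \hat{\beta} = (X'X)^{-1}X'y$ and $\hat{y} = X\hat{\beta} = X(X'X)^{-1}X'y = Hy$
$H$: "hat" matrix, or projection matrix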
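The normal equations above can be checked numerically. The following is a minimal NumPy sketch, not taken from the slides: the function name `ols_fit` and the simulated data are assumptions for illustration only.

```python
import numpy as np

def ols_fit(X, y):
    """Least-squares fit; a column of ones is prepended for the intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])   # design matrix with intercept
    XtX = X1.T @ X1
    beta_hat = np.linalg.solve(XtX, X1.T @ y)    # solves X'X beta = X'y
    H = X1 @ np.linalg.solve(XtX, X1.T)          # hat (projection) matrix
    return beta_hat, H

# Simulated data (hypothetical): 50 records, 3 predictors
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 1.0 + X @ np.array([0.5, -2.0, 0.0]) + rng.normal(scale=0.1, size=50)

beta_hat, H = ols_fit(X, y)
y_hat = H @ y   # fitted values, identical to X1 @ beta_hat
```

Solving the linear system is used instead of forming the inverse explicitly; H is idempotent and its trace equals the number of fitted coefficients (here 4).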
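7 Linear Regression and Least Squares
Some additional assumptions: $\mathrm{Cov}(y_i, y_{i'}) = 0$, $\mathrm{Var}(y_i) = \sigma^2$, $x_i$: fixed (non random)
$\mathrm{Var}(\hat{\beta}) = (X'X)^{-1}\sigma^2$; $\hat{\sigma}^2 = \frac{1}{N - p - 1} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$ [Note: $E[\hat{\sigma}^2] = \sigma^2$]
Moreover: $Y = E[Y \mid X] + \varepsilon = \beta_0 + \sum_{j=1}^{p} X_j \beta_j + \varepsilon$, with $\varepsilon \overset{iid}{\sim} N(0, \sigma^2)$, so that
$\hat{\beta} \sim N(\beta, (X'X)^{-1}\sigma^2)$ and $(N - p - 1)\hat{\sigma}^2 \sim \sigma^2 \chi^2_{N-p-1}$
Hypothesis test of $H_0: \beta_j = 0$: $z_j = \hat{\beta}_j / (\hat{\sigma}\sqrt{\nu_j}) \sim t_{N-p-1}$, where $\nu_j$ is the $j$th diagonal element of $(X'X)^{-1}$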
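8 Gauss-Markov Theorem
Linear combination of the parameters: $\theta = k'\beta$
LS: $\hat{\theta} = k'\hat{\beta} = k'(X'X)^{-1}X'y$, with $E[k'\hat{\beta}] = k'(X'X)^{-1}X'X\beta = k'\beta$
For any other linear estimator $\tilde{\theta} = c'y$ with $E[c'y] = k'\beta$ (i.e., unbiased): $\mathrm{Var}(k'\hat{\beta}) \leq \mathrm{Var}(c'y)$
Mean squared error (MSE): $\mathrm{MSE}(\tilde{\theta}) = E(\tilde{\theta} - \theta)^2 = \mathrm{Var}(\tilde{\theta}) + [E(\tilde{\theta}) - \theta]^2$ (variance + squared bias)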
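The variance plus squared-bias decomposition of the MSE follows by adding and subtracting $E(\tilde{\theta})$. A short worked derivation, using only the symbols of the slide:

```latex
\begin{aligned}
\mathrm{MSE}(\tilde{\theta})
  &= E\big[(\tilde{\theta} - \theta)^2\big]
   = E\big[\big(\tilde{\theta} - E(\tilde{\theta}) + E(\tilde{\theta}) - \theta\big)^2\big] \\
  &= E\big[(\tilde{\theta} - E(\tilde{\theta}))^2\big]
   + 2\,\big(E(\tilde{\theta}) - \theta\big)\,
        \underbrace{E\big[\tilde{\theta} - E(\tilde{\theta})\big]}_{=\,0}
   + \big(E(\tilde{\theta}) - \theta\big)^2 \\
  &= \mathrm{Var}(\tilde{\theta}) + \big[E(\tilde{\theta}) - \theta\big]^2 .
\end{aligned}
```

The cross term vanishes because $E(\tilde{\theta}) - \theta$ is a constant. A biased estimator can therefore have a smaller MSE than the least-squares estimator whenever its reduction in variance exceeds its squared bias, which is the motivation for shrinkage.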
9 Least Squares Estimation
- DRAWBACKS:
  - Prediction accuracy: unbiased but large variance
  - Interpretation
- SOLUTION:
  - Feature (variable) selection: best subset regression (min RSS); stepwise selection (forward, backward, hybrid)
  - Shrinkage methods: ridge regression and LASSO
  - Other techniques: principal components regression; partial least squares (PLS)
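10 Ridge Regression
$\hat{\beta}^{ridge} = \arg\min_{\beta} \left\{ \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \right)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \right\}$, with $\lambda \geq 0$ (complexity parameter)
or, equivalently: $\hat{\beta}^{ridge} = \arg\min_{\beta} \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \right)^2$ subject to $\sum_{j=1}^{p} \beta_j^2 \leq s$
$\hat{\beta}_0 = \bar{y} = \sum_{i=1}^{N} y_i / N$ after centering $y$ and the $x$'s (i.e., $y_i - \bar{y}$ and $x_{ij} - \bar{x}_j$)
$\mathrm{RSS}(\lambda) = (y - X\beta)'(y - X\beta) + \lambda\beta'\beta$, so that $\hat{\beta}^{ridge} = (X'X + \lambda I)^{-1}X'y$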
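The closed-form ridge solution remains computable when p exceeds N, which is the setting of genome-wide marker data. A minimal sketch, assuming simulated data; `ridge_fit` and the chosen values of `lam` are hypothetical names and settings, not part of the original slides.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression on centered data; the intercept is not penalized."""
    x_bar = X.mean(axis=0)
    Xc = X - x_bar                      # center the predictors
    yc = y - y.mean()                   # center the response
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    beta0 = y.mean() - x_bar @ beta     # intercept on the original scale
    return beta0, beta

# Simulated "big p, small n" data (hypothetical): 20 records, 100 predictors
rng = np.random.default_rng(0)
n, p = 20, 100
X = rng.normal(size=(n, p))
y = X[:, :5] @ np.ones(5) + rng.normal(size=n)

b0_small, b_small = ridge_fit(X, y, lam=0.1)
b0_large, b_large = ridge_fit(X, y, lam=100.0)
# A larger lam shrinks the coefficient vector more strongly toward zero.
```

With `lam > 0` the matrix `Xc.T @ Xc + lam * I` is positive definite even though `Xc.T @ Xc` is singular here, so the system always has a unique solution.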
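11 LASSO
$\hat{\beta}^{lasso} = \arg\min_{\beta} \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \right)^2$, subject to: $\sum_{j=1}^{p} |\beta_j| \leq t$
[Figure: Estimation picture for the lasso (left) and ridge regression (right). The solid blue areas are the constraint regions $|\beta_1| + |\beta_2| \leq t$ (lasso) and $\beta_1^2 + \beta_2^2 \leq t^2$ (ridge regression), while the red ellipses are the contours of the least squares error function.]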
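Unlike ridge regression, the lasso has no closed-form solution. One common way to compute it is cyclic coordinate descent with soft-thresholding, sketched below for the equivalent Lagrangian form $\tfrac{1}{2}\mathrm{RSS} + \lambda \sum_j |\beta_j|$. This is an illustrative technique chosen here, not a method stated on the slides; the function names and simulated data are assumptions.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for 0.5 * RSS + lam * sum(|beta_j|) on centered data."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = Xc.shape[1]
    beta = np.zeros(p)
    col_ss = (Xc ** 2).sum(axis=0)           # x_j'x_j for each column
    resid = yc.copy()                        # current residual y - X beta
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue                     # skip constant columns
            resid += Xc[:, j] * beta[j]      # remove coordinate j from the fit
            rho = Xc[:, j] @ resid           # x_j' (partial residual)
            beta[j] = soft_threshold(rho, lam) / col_ss[j]
            resid -= Xc[:, j] * beta[j]      # put the updated coordinate back
    return beta

# Simulated data (hypothetical): only predictors 0 and 3 have nonzero effects
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([2.0, 0.0, 0.0, -1.0, 0.0]) + rng.normal(scale=0.5, size=100)
beta_lasso = lasso_cd(X, y, lam=20.0)
```

The soft-threshold step is what sets some coefficients exactly to zero, which corresponds to the corners of the diamond-shaped constraint region in the figure.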
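12 Shrinkage Estimators: Generalization
$\hat{\beta} = \arg\min_{\beta} \left\{ \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \right)^2 + \lambda \sum_{j=1}^{p} |\beta_j|^q \right\}$, $q \geq 0$
[Figure: Contours of constant value of $\sum_j |\beta_j|^q$ for given values of $q$.]
13 Model Selection
- GOODNESS-OF-FIT VS. MODEL COMPLEXITY: over-reduction vs. over-fit
- Bias-variance tradeoff
14 Model Selection
- Goodness-of-fit: likelihood ratio approach (LRT; nested models): $\mathrm{LRT} = -2 \ln (L_0 / L_1) \sim \chi^2$, with $L_0$ and $L_1$ the maximized likelihoods of the reduced and full models and degrees of freedom equal to the difference in their numbers of parameters
- Model complexity: number of free parameters (effective number). Linear (regularized) fitting: $\hat{y} = Sy$, with effective number of parameters $\mathrm{trace}(S)$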
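15 Effective Number of Parameters
Example with a simple linear regression: $y_i = \beta_0 + \beta_1 x_i + e_i$, i.e. $y = X\beta + e$ with $X = [\mathbf{1} \;\; x]$
$X'X = \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix}$, $(X'X)^{-1} = k \begin{bmatrix} \sum x_i^2 & -\sum x_i \\ -\sum x_i & n \end{bmatrix}$, with $k = 1/\det(X'X)$
$\mathrm{trace}[X(X'X)^{-1}X'] = \mathrm{trace}[(X'X)^{-1}X'X] = k\,[n\sum x_i^2 - (\sum x_i)^2] + k\,[n\sum x_i^2 - (\sum x_i)^2] = 2$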
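The trace of the smoother matrix can be computed from the singular values of X, which also gives the effective number of parameters of a ridge fit, $\sum_j d_j^2 / (d_j^2 + \lambda)$. A minimal sketch; `effective_df` is a hypothetical helper name, and for simplicity the intercept column is penalized together with the slope, which is an assumption of this sketch rather than of the slides.

```python
import numpy as np

def effective_df(X, lam=0.0):
    """trace of X (X'X + lam*I)^{-1} X', computed from the singular values of X."""
    d = np.linalg.svd(X, compute_uv=False)
    d = d[d > 1e-10 * d.max()]              # drop numerically zero singular values
    return float(np.sum(d**2 / (d**2 + lam)))

# Simple linear regression design: intercept plus one covariate
x = np.arange(10.0)
X = np.column_stack([np.ones(10), x])

df_ols = effective_df(X)             # approximately 2: intercept and slope
df_ridge = effective_df(X, lam=5.0)  # strictly less than 2 once lam > 0
```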
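16 Model Selection
- Balancing goodness-of-fit and complexity
  - Akaike information criterion (AIC): $\mathrm{AIC} = -2\ln(L) + 2p$
  - Bayesian information criterion (BIC), or Schwarz Criterion: $\mathrm{BIC} = -2\ln(L) + p\ln(n)$
- If $e_i \overset{iid}{\sim} N(0, \sigma_e^2)$, then (up to an additive constant): $\mathrm{AIC} = n \ln(\mathrm{RSS}/n) + 2p$ and $\mathrm{BIC} = n \ln(\mathrm{RSS}/n) + p\ln(n)$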
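Under Gaussian errors both criteria depend on the data only through RSS and n. A minimal sketch using the additive-constant-free forms $n\ln(\mathrm{RSS}/n) + 2p$ and $n\ln(\mathrm{RSS}/n) + p\ln(n)$; the function name and the simulated polynomial example are assumptions for illustration.

```python
import numpy as np

def gaussian_aic_bic(y, y_hat, n_params):
    """AIC and BIC (up to an additive constant) for a Gaussian linear model."""
    n = len(y)
    rss = float(np.sum((y - y_hat) ** 2))
    aic = n * np.log(rss / n) + 2 * n_params
    bic = n * np.log(rss / n) + n_params * np.log(n)
    return aic, bic

# Simulated data (hypothetical) with a true quadratic trend
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = 1.0 + 0.5 * x + 0.8 * x**2 + rng.normal(scale=0.2, size=30)

results = {}
for degree in (1, 2):                      # linear vs. quadratic regression
    coef = np.polyfit(x, y, degree)
    y_hat = np.polyval(coef, x)
    results[degree] = gaussian_aic_bic(y, y_hat, n_params=degree + 1)
```

Smaller values are preferred. BIC penalizes each parameter by ln(n) instead of 2, so it favors smaller models than AIC once n exceeds about 7.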
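17 Model Selection Example: linear vs. quadratic regression
- Linear: $y = \beta_0 + \beta_1 x + e$; $R^2 = 0.53$, $R^2_{adj} = 0.30$, $\hat{\sigma}_e = 0.35$
- Quadratic: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + e$; $R^2 = 0.70$, $R^2_{adj} = 0.0$, $\hat{\sigma}_e = 0.45$
18 Predictive Ability [Figure: Behavior of test sample and training sample error as the model complexity is varied.]
19 CROSS-VALIDATION (Predictive Ability)
- K-FOLD: the data are split into a training set and a testing set, rotating over the K folds
- LEAVE-ONE-OUT (n-fold)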
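K-fold cross-validation of predictive ability can be sketched as follows. The example uses ridge regression as the predictor and the correlation between observed and predicted values as the measure of predictive ability; both choices, the function names and the simulated data are assumptions made for illustration.

```python
import numpy as np

def kfold_indices(n, k, rng):
    """Randomly partition the indices 0..n-1 into k folds of near-equal size."""
    return np.array_split(rng.permutation(n), k)

def cv_predictive_correlation(X, y, lam, k=5, seed=0):
    """Correlation between y and out-of-fold ridge predictions."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    y_pred = np.empty(n)
    for test_idx in kfold_indices(n, k, rng):
        train = np.ones(n, dtype=bool)
        train[test_idx] = False                     # hold out the testing fold
        mu_x = X[train].mean(axis=0)                # centering uses training data only
        mu_y = y[train].mean()
        Xc = X[train] - mu_x
        beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p),
                               Xc.T @ (y[train] - mu_y))
        y_pred[test_idx] = mu_y + (X[test_idx] - mu_x) @ beta
    return float(np.corrcoef(y, y_pred)[0, 1])

# Simulated data (hypothetical) with a strong signal
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=100)
r_cv = cv_predictive_correlation(X, y, lam=1.0, k=5)
```

Setting `k` equal to the number of records gives leave-one-out cross-validation.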
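20 LOOCV [Table: leave-one-out predictions for each observation (Obs) under the linear and quadratic models, with the PRESS statistic for each model.]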
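For least-squares fits, the leave-one-out prediction errors and the PRESS statistic do not require refitting the model n times: the deleted residual equals $e_i / (1 - h_{ii})$, where $h_{ii}$ is the $i$th diagonal element of the hat matrix. A minimal sketch comparing this shortcut with brute-force refitting; the data are simulated and the function names are hypothetical.

```python
import numpy as np

def press(X, y):
    """PRESS via the hat-matrix shortcut: sum of (e_i / (1 - h_ii))^2."""
    H = X @ np.linalg.solve(X.T @ X, X.T)
    resid = y - H @ y
    return float(np.sum((resid / (1.0 - np.diag(H))) ** 2))

def press_brute_force(X, y):
    """PRESS by explicitly refitting the model with each record left out."""
    total = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        total += (y[i] - X[i] @ beta) ** 2
    return float(total)

# Simulated data (hypothetical): truly linear trend, 12 observations
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 12)
y = 2.0 + 3.0 * x + rng.normal(scale=0.3, size=12)
X_lin = np.column_stack([np.ones(12), x])
X_quad = np.column_stack([np.ones(12), x, x**2])

press_lin = press(X_lin, y)
press_quad = press(X_quad, y)
```

The model with the smaller PRESS has the better leave-one-out predictive ability; unlike RSS, PRESS does not necessarily decrease when terms are added.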
21 GWMAS
- MAIN (STATISTICAL/COMPUTATIONAL) CHALLENGES
  - Curse of dimensionality
  - How to deal with non-additive models
- TWO BASIC APPROACHES
  - Explicit regression of Y on M: $u_i = f(x_i, \beta) = x_i'\beta$
  - Implicit regression using RKHS: $u_i = f(x_i)$, with $\mathrm{Cov}(f_i, f_{i'}) \propto K(x_i, x_{i'})$; e.g., $K(x_i, x_{i'})$ ...
22 BAYESIAN ALTERNATIVE A Quick Tour on Bayesian Models Commonly Used in Genomic Selection
23 GWMAS (BLUP)
- Model: $y = \mu + \sum_{j=1}^{p} X_j g_j + e$
  Marker effects assumed normally distributed with a common variance, i.e.: $g_j \sim N(0, \sigma_0^2)$
  Estimates: ...
- How to choose $\sigma_0^2$? Arbitrary; but it controls the amount of shrinkage
  Alternative: set $\sigma_0^2$ from an (a priori) estimate of the total additive genetic variance
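24 BAYES A (Meuwissen et al. 2001)
$y = \mu + \sum_{j=1}^{p} X_j g_j + e$
$y \mid \mu, g, \sigma_e^2 \sim N(\mu + \sum_{j=1}^{p} X_j g_j, \; I\sigma_e^2)$
$g_j \mid \sigma_j^2 \sim N(0, \sigma_j^2)$
- Prior distributions:
  - $\sigma_j^2 \sim \chi^{-2}(\nu, S)$ (scaled inverted chi-square distribution with scale parameter $S$ and $\nu$ degrees of freedom)
  - $\sigma_e^2 \sim \chi^{-2}(-2, 0)$
25 BAYES B (Meuwissen et al. 2001)
$y = \mu + \sum_{j=1}^{p} X_j g_j + e$
$y \mid \mu, g, \sigma_e^2 \sim N(\mu + \sum_{j=1}^{p} X_j g_j, \; I\sigma_e^2)$
$g_j \mid \sigma_j^2 \sim N(0, \sigma_j^2)$
- Prior distributions:
  - $\sigma_j^2 = 0$ with probability $\pi$; $\sigma_j^2 \sim \chi^{-2}(\nu, S)$ with probability $(1 - \pi)$
  - $\sigma_e^2 \sim \chi^{-2}(-2, 0)$
26 BAYES B*
$y = \mu + \sum_{j=1}^{p} X_j g_j + e$
$y \mid \mu, g, \sigma_e^2 \sim N(\mu + \sum_{j=1}^{p} X_j g_j, \; I\sigma_e^2)$
- Prior distributions:
  - $g_j = 0$ with probability $\pi$
  - $g_j \mid \sigma_j^2 \sim N(0, \sigma_j^2)$, $\sigma_j^2 \sim \chi^{-2}(\nu, S)$ with probability $(1 - \pi)$
  - $\sigma_e^2 \sim \chi^{-2}(-2, 0)$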
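27 BAYES B**
$y = \mu + \sum_{j=1}^{p} X_j g_j + e$
$y \mid \mu, g, \sigma_e^2 \sim N(\mu + \sum_{j=1}^{p} X_j g_j, \; I\sigma_e^2)$
- Prior distributions:
  - $g_j \sim N(0, c \ldots)$ with probability $\pi$ (the variance term after $c$ is not legible in the transcription)
  - $g_j \mid \sigma_j^2 \sim N(0, \sigma_j^2)$, $\sigma_j^2 \sim \chi^{-2}(\nu, S)$ with probability $(1 - \pi)$
  - $\sigma_e^2 \sim \chi^{-2}(-2, 0)$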
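28 BAYES C
$y = \mu + \sum_{j=1}^{p} X_j g_j + e$
$y \mid \mu, g, \sigma_e^2 \sim N(\mu + \sum_{j=1}^{p} X_j g_j, \; I\sigma_e^2)$
- Prior distributions:
  - $g_j = 0$ with probability $\pi$
  - $g_j \mid \sigma_g^2 \sim N(0, \sigma_g^2)$ with probability $(1 - \pi)$
  - $\pi \sim \mathrm{Uniform}(0, 1)$
  - $\sigma_g^2 \sim \chi^{-2}(\nu, S)$
  - $\sigma_e^2 \sim \chi^{-2}(-2, 0)$
29 BAYESIAN LASSO
$y = \mu + \sum_{j=1}^{p} X_j g_j + e$
$y \mid \mu, g, \sigma_e^2 \sim N(\mu + \sum_{j=1}^{p} X_j g_j, \; I\sigma_e^2)$
$g_j \mid \sigma_j^2 \sim N(0, \sigma_j^2)$
- Prior distributions:
  - $\sigma_j^2 \sim \mathrm{Exponential}(\lambda)$
  - $\sigma_e^2 \sim \chi^{-2}(-2, 0)$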
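30 GBLUP
Regression with genetic effects with normal distribution with common variance: $y = \mu + \sum_{j=1}^{p} X_j g_j + e$, with: $g_j \mid \sigma_g^2 \sim N(0, \sigma_g^2)$
Equivalent model: $y = \mu + b + e$, with: $b \mid \sigma_b^2 \sim N(0, G\sigma_b^2)$
$G$ is the genomic relationship matrix: $G = \left[ 2 \sum_{j=1}^{p} p_j (1 - p_j) \right]^{-1} (X - M)(X - M)'$, where $p_j$ denotes the allele frequency at marker $j$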
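The equivalence between the marker-effects regression and the GBLUP model can be verified numerically through the identity $Z(Z'Z + \lambda I)^{-1}Z' = ZZ'(ZZ' + \lambda I)^{-1}$. The sketch below assumes simulated 0/1/2 genotypes, a fixed variance ratio `lam` standing in for $\sigma_e^2/\sigma_g^2$, and a scaling constant $2\sum_j p_j(1-p_j)$ for G; all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 200
freq = rng.uniform(0.1, 0.9, size=p)                       # simulated allele frequencies
M = rng.binomial(2, freq, size=(n, p)).astype(float)       # genotypes coded 0/1/2
y = rng.normal(size=n)                                     # simulated phenotypes

Z = M - M.mean(axis=0)        # centered marker matrix
yc = y - y.mean()             # phenotypes centered (mu estimated by the mean)
lam = 5.0                     # assumed ratio sigma_e^2 / sigma_g^2

# Marker-effects model: p equations, then GEBV = Z g_hat
g_hat = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ yc)
gebv_markers = Z @ g_hat

# Equivalent model with K = ZZ': only n equations
K = Z @ Z.T
gebv_gblup = K @ np.linalg.solve(K + lam * np.eye(n), yc)

# Scaling K into a relationship matrix G = K / c only rescales the variance ratio
p_hat = M.mean(axis=0) / 2.0
c = 2.0 * np.sum(p_hat * (1.0 - p_hat))
G = K / c
gebv_G = G @ np.linalg.solve(G + (lam / c) * np.eye(n), yc)
```

All three vectors of genomic breeding values agree; the practical difference is that the second and third forms solve an n by n system rather than a p by p one.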
31 ssGBLUP
Single mixed model with all animals (genotyped and non-genotyped) included, with matrix $A^{-1}$ replaced by $H^{-1}$:
$H^{-1} = A^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & G^{-1} - A_{22}^{-1} \end{bmatrix}$
32 COMPARISON OF GWMAS MODELS Whole Genome Prediction Within and Across Environments: An Application to Wheat Yield. Duan, H. (Master's Thesis)
33 Data Description
- Global Wheat Program of the International Maize and Wheat Improvement Center (CIMMYT)
- 599 wheat lines with 1,279 markers genotyped
- Global environments were grouped into four macroenvironments (ME 1, ME 2, ME 3, and ME 4)
- Yields standardized with mean of 0 and standard deviation of 1
Research Goal: Assess the performance of different models (Bayes A, Bayes B, Bayes C, Bayesian LASSO, and Bayesian Ridge) for prediction within and across environments (G × E interaction)
34 [Figure: Boxplot of standardized wheat yields in each macroenvironment.]
35 [Figure: Scatterplots of wheat yields for each pair of macroenvironments.]
36 [Figure: Genotype × macroenvironment interaction: patterns of yield across environments for the 599 wheat lines.]
37 [Figure: Correlations between observed and predicted yield within macroenvironments.]
38 [Figure: Correlations between observed and predicted yield across macroenvironments; Models: Bayes A and Bayes C.]
39 High Throughput Computing: The Condor at UW-Madison: http://research.cs.wisc.edu/condor/
40 MODELLING NON-ADDITIVE GENETIC EFFECTS A Quick Tour on Semiparametric Kernel-based Methods
41 GWMAS (Including Non-Additive Genetic Effects)
- Many studies that attempt to identify the genetic basis of complex traits ignore the possibility that loci interact, despite the known substantial contribution of such interactions to genetic variation (Carlborg and Haley 2005).
- Dekkers and Hospital (2002) also pointed out that existing statistical methods for marker-assisted selection do not deal well with the complexity posed by quantitative traits; among the reasons they cite is the inadequate handling of non-additivity.
- To address this issue, extensions of the GWMAS model to accommodate dominance and some level of epistasis have been proposed, as discussed next.
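42
- Extensions of the GWMAS model to accommodate dominance and some level of epistasis have been proposed (Yi et al. 2003; Huang et al. 2007; Xu 2007), which can be described as:
  $y = \mu + \sum_{j=1}^{p} X_j g_j + \sum_{j=1}^{p-1} \sum_{j' > j} X_{jj'} g_{jj'} + e$
  where the $g_{jj'}$ refer to interaction terms relative to epistatic effects involving loci $j$ and $j'$, and the $X_{jj'}$ represent appropriate design matrices.
- In the case of diallelic loci such as SNPs, each row of $X_j g_j$ can be factorized into additive and dominance effects as $x_{ij} g_j = x_{ij}\alpha_j + (1 - |x_{ij}|)\delta_j$, where $x_{ij} = -1$, 0 or 1 for the three possible genotypes aa, Aa and AA, respectively, and $\alpha_j$ and $\delta_j$ represent the additive and dominance effects relative to locus $j$.
- Similarly, the four degrees of freedom relative to each pairwise interaction between diallelic loci can be described as:
  $x_{ijj'} g_{jj'} = x_{ij}x_{ij'}(\alpha\alpha)_{jj'} + x_{ij}(1 - |x_{ij'}|)(\alpha\delta)_{jj'} + (1 - |x_{ij}|)x_{ij'}(\delta\alpha)_{jj'} + (1 - |x_{ij}|)(1 - |x_{ij'}|)(\delta\delta)_{jj'}$
  where $(\alpha\alpha)_{jj'}$, $(\alpha\delta)_{jj'}$, $(\delta\alpha)_{jj'}$, and $(\delta\delta)_{jj'}$ represent additive × additive, additive × dominance, dominance × additive, and dominance × dominance epistasis between loci $j$ and $j'$ ($j' > j$).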
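The additive, dominance and pairwise epistatic covariates can be built directly from genotype codes. The sketch below assumes genotypes coded -1, 0, 1 for aa, Aa, AA and a dominance indicator of $1 - |x|$ (1 for heterozygotes, 0 otherwise); this coding and the function names are assumptions for illustration.

```python
import numpy as np

def additive_dominance_design(x):
    """Additive (x) and dominance (1 - |x|) covariates for genotypes coded -1/0/1."""
    x = np.asarray(x, dtype=float)
    return x, 1.0 - np.abs(x)

def pairwise_epistasis_columns(x1, x2):
    """Four covariates for one pair of loci: a*a, a*d, d*a and d*d."""
    a1, d1 = additive_dominance_design(x1)
    a2, d2 = additive_dominance_design(x2)
    return np.column_stack([a1 * a2, a1 * d2, d1 * a2, d1 * d2])

def n_pairwise_epistatic_effects(p):
    """Number of pairwise epistatic effects for p diallelic loci: 4 per pair."""
    return 4 * p * (p - 1) // 2

# Two loci scored on five individuals (hypothetical genotypes)
locus1 = [-1, 0, 1, 0, 1]
locus2 = [0, 0, -1, 1, 1]
E = pairwise_epistasis_columns(locus1, locus2)   # shape (5, 4)
n_effects = n_pairwise_epistatic_effects(1000)   # 1,998,000 effects for 1,000 loci
```

The count in the last line shows how quickly the number of interaction terms grows with the number of markers, even when only pairwise effects are considered.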
43 SEMIPARAMETRIC APPROACHES
- The non-additive GWMAS model presented relies on very strong assumptions, such as linearity, multivariate normality, and proportion of segregating loci (Gianola et al. 2006).
- The genome seems to be much more highly interactive than such models can accommodate. The number of higher-order interactions (i.e., multi-loci epistatic effects) grows extremely quickly as the number of markers increases.
- Partition of genetic variance into orthogonal additive, dominance, additive × additive, additive × dominance, etc. components is possible only under highly idealized, unrealistic conditions (Cockerham 1954; Kempthorne 1954).
- Gianola et al. (2006) and Gianola and van Kaam (2008) proposed semi-parametric approaches using kernel regression and reproducing kernel Hilbert spaces (RKHS) regression procedures embedded into standard mixed-effects linear models.
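44 GWMAS: Implicit Regression using RKHS
- RKHS regression: $y = \mu + f + \varepsilon$
- Non-parametric representation of $f(x_i)$: $f = [f(x_1), f(x_2), \dots, f(x_n)]'$, with $\mathrm{Cov}(f) = K\sigma_f^2$ and kernel matrix
  $K = \begin{bmatrix} K(x_1, x_1) & \cdots & K(x_1, x_n) \\ \vdots & \ddots & \vdots \\ K(x_n, x_1) & \cdots & K(x_n, x_n) \end{bmatrix}$
- Bayesian RKHS regression: $\begin{bmatrix} f \\ \varepsilon \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} K\sigma_f^2 & 0 \\ 0 & I\sigma_\varepsilon^2 \end{bmatrix} \right)$
45 Choosing the Reproducing Kernel
- Steps in creating a kernel:
  - Define a notion of distance in input space
  - Map from distance to covariance structure
- Examples:
  - Input space: pedigree information; $K = A$ or $K = D$, etc.
  - Input space: genotype information; e.g. individuals 1, 2 and 3 with genotypes AA, Aa and Aa give kernel matrices $K^*$ with off-diagonal entries such as 0.5, etc.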
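46 RKHS Regression
Penalized criterion: $J(f) = L(y, f) + \lambda \lVert f \rVert_H^2$
- $L(y, f)$: loss function (e.g. logL, RSS)
- $\lambda$: smoothing parameter
- $\lVert f \rVert_H^2$: model complexity measurement; squared norm in Hilbert space
Kimeldorf and Wahba (1970): the solution has the form $f = Kc$, where $K$ is an $n \times n$ positive definite matrix whose elements are evaluations of a RK, and $c$ is an $n \times 1$ vector of unknown constants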
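With RSS as the loss function, the coefficients of the representation $f = Kc$ have the closed form $c = (K + \lambda I)^{-1}(y - \mu)$. The sketch below assumes a Gaussian kernel $K(x_i, x_{i'}) = \exp(-h \lVert x_i - x_{i'} \rVert^2)$, a fixed bandwidth `h` and smoothing parameter `lam`, and the sample mean as the estimate of $\mu$; these are illustrative choices, not settings reported in the slides.

```python
import numpy as np

def gaussian_kernel(X, h):
    """Kernel matrix with entries exp(-h * squared Euclidean distance)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)               # guard against tiny negative rounding errors
    return np.exp(-h * d2)

def rkhs_fit(K, y, lam):
    """Minimize RSS + lam * c'Kc over f = Kc; returns the mean and c."""
    mu = y.mean()
    c = np.linalg.solve(K + lam * np.eye(len(y)), y - mu)
    return mu, c

# Simulated data (hypothetical): 40 individuals, 25 markers coded -1/0/1,
# with a purely non-additive (product) signal between the first two markers
rng = np.random.default_rng(2)
X = rng.integers(-1, 2, size=(40, 25)).astype(float)
y = X[:, 0] * X[:, 1] + rng.normal(scale=0.5, size=40)

K = gaussian_kernel(X, h=0.05)
mu, c = rkhs_fit(K, y, lam=1.0)
f_hat = K @ c                               # fitted genetic values
```

No interaction terms are specified explicitly; any non-additive signal is captured only through the distances between genotypes encoded in K, which is the sense in which the regression is implicit.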
47 Bayesian RKHS Animal Model: $u = Kc$
48 MODELLING NON-ADDITIVE GENETIC EFFECTS An Example Using Nonparametric Methods
Gonzalez-Recio O, Gianola D, Long N, Weigel KA, Rosa GJM and Avendano S. Nonparametric methods for incorporating genomic information into genetic evaluations: an application to mortality in broilers. Genetics 178(4), 2008.
49 SEMIPARAMETRIC APPROACHES (Gonzalez-Recio et al. 2008)
- Five approaches compared:
  - $F_\infty$-metric (linear regression) model
  - Kernel regression
  - RKHS regression
  - Bayesian regression, with 1,000 SNPs
  - Standard E-BLUP genetic evaluation
  The $F_\infty$-metric, kernel regression and RKHS approaches used 24 pre-selected SNPs.
- Response variable: late mortality (14-42 days of age)
- Phenotypic data: 12,167 progeny of 200 sires
50 RESULTS
- The heritability estimate ($h^2$ = 0.0) from the E-BLUP approach suggests that genetic evaluation could be improved if suitable molecular markers were available.
- Posterior means and standard deviations of the residual variance suggest that RKHS accounted for more variance in the data.
51
- The two nonparametric methods fitted the data better, having lower RSS.
52 RESULTS
- Predictive ability indicated advantages of the RKHS approach relative to other methods.
53 SELECTIVE GENOTYPING The Effect of Selectively Genotyping Individuals in Genomic Selection
Boligon AA, Long N, Albuquerque LG, Weigel KA, Gianola D and Rosa GJM. Comparison of selective genotyping strategies for prediction of breeding values in a population undergoing selection. Journal of Animal Science (in press).
54 Introduction
- Genomic selection: genome-enabled prediction of breeding values
- Genotyping cost: genotype a subset of animals. Effect on prediction accuracy?
- Objective: To evaluate the quality of GEBV for candidates to selection based on different strategies of selective genotyping of a population undergoing selection, with different selection intensities
55 Material and Methods: Population/Generations
- 5,000 generations (mutation-drift equilibrium): $t_1$: 100 animals (50 males + 50 females), up to $t_{5000}$: 00 animals
- $G_0$: ,500 animals; $G_1$: ,500 animals
- Random mating; factorial mating; directional selection
- Selection intensities [Table: percentage and number of animals selected out of ,500]
56 Material and Methods: Markers and Genetic Effects
- Genome: 0 chromosomes with 00 cM each
- Loci: 302 biallelic loci (202 markers + 100 QTL) in each chromosome, ordered as M1 M2 Q1 M3 M4 ... M199 M200 Q100 M201 M202
- Mutation rates: QTL .5 × 10^-5, Markers
- QTL effects: normally distributed
- Heritability: $h^2$ = 0.0, 0.5 and 0.50
57 Material and Methods: Analysis
- Training population: 500 genotyped animals in $G_0$
- Selective genotyping strategies: Random, Top, Bottom, Extreme, Less related
- Testing population: Generation $G_1$
- Model: Bayesian LASSO
- Performance: correlations between GEBV and TBV (accuracy), and predictive mean squared error
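The two performance measures and four of the five genotyping strategies can be sketched as below. The ranking criterion is an assumption: the slide does not state whether animals were ranked on phenotype or on another quantity, so a generic `score` vector is used. The "less related" strategy needs a relationship matrix and is left out. Function names are hypothetical.

```python
import numpy as np

def accuracy_and_pmse(gebv, tbv):
    """Correlation between GEBV and TBV, and predictive mean squared error."""
    gebv = np.asarray(gebv, dtype=float)
    tbv = np.asarray(tbv, dtype=float)
    acc = float(np.corrcoef(gebv, tbv)[0, 1])
    pmse = float(np.mean((gebv - tbv) ** 2))
    return acc, pmse

def select_for_genotyping(score, n_select, strategy, rng=None):
    """Indices of animals to genotype under a given strategy (n_select >= 1)."""
    order = np.argsort(score)                       # ascending order of the score
    if strategy == "top":
        return order[-n_select:]
    if strategy == "bottom":
        return order[:n_select]
    if strategy == "extreme":                       # both tails of the distribution
        half = n_select // 2
        return np.concatenate([order[:half], order[-(n_select - half):]])
    if strategy == "random":
        rng = np.random.default_rng() if rng is None else rng
        return rng.choice(len(score), size=n_select, replace=False)
    raise ValueError(f"unknown strategy: {strategy}")

# Small example with hypothetical scores for six animals
score = np.array([3.0, 1.0, 4.0, 1.5, 9.0, 2.6])
top = select_for_genotyping(score, 2, "top")          # animals 2 and 4
extreme = select_for_genotyping(score, 2, "extreme")  # animals 1 and 4
```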
58 Material and Methods [Diagram: breeding selection and genotyping selection in $G_0$, followed by $G_1$. Testing population: ,500 selection candidates]
59 Prediction accuracies (correlations between GEBV and TBV) [Figure: two panels; vertical axis: Correlations; horizontal axis: Selection of Animals in $G_0$]
60 Predictive mean squared error (PMSE) [Figure: two panels; vertical axis: PMSE; horizontal axis: Selection of Animals in $G_0$]
61 Number and percentage of coincident animals
More informationIntroduction to Analysis of Variance (ANOVA) Part 1
Introducton to Analss of Varance (ANOVA) Part 1 Sngle factor The logc of Analss of Varance Is the varance explaned b the model >> than the resdual varance In regresson models Varance explaned b regresson
More informationBasic Business Statistics, 10/e
Chapter 13 13-1 Basc Busness Statstcs 11 th Edton Chapter 13 Smple Lnear Regresson Basc Busness Statstcs, 11e 009 Prentce-Hall, Inc. Chap 13-1 Learnng Objectves In ths chapter, you learn: How to use regresson
More informationCorrelation and Regression. Correlation 9.1. Correlation. Chapter 9
Chapter 9 Correlaton and Regresson 9. Correlaton Correlaton A correlaton s a relatonshp between two varables. The data can be represented b the ordered pars (, ) where s the ndependent (or eplanator) varable,
More informationHomework 9 STAT 530/J530 November 22 nd, 2005
Homework 9 STAT 530/J530 November 22 nd, 2005 Instructor: Bran Habng 1) Dstrbuton Q-Q plot Boxplot Heavy Taled Lght Taled Normal Skewed Rght Department of Statstcs LeConte 203 ch-square dstrbuton, Telephone:
More informationReduced slides. Introduction to Analysis of Variance (ANOVA) Part 1. Single factor
Reduced sldes Introducton to Analss of Varance (ANOVA) Part 1 Sngle factor 1 The logc of Analss of Varance Is the varance explaned b the model >> than the resdual varance In regresson models Varance explaned
More informatione i is a random error
Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where + β + β e for,..., and are observable varables e s a random error How can an estmaton rule be constructed for the unknown
More informationHowever, since P is a symmetric idempotent matrix, of P are either 0 or 1 [Eigen-values
Fall 007 Soluton to Mdterm Examnaton STAT 7 Dr. Goel. [0 ponts] For the general lnear model = X + ε, wth uncorrelated errors havng mean zero and varance σ, suppose that the desgn matrx X s not necessarly
More informationOn New Selection Procedures for Unequal Probability Sampling
Int. J. Oen Problems Comt. Math., Vol. 4, o. 1, March 011 ISS 1998-66; Coyrght ICSRS Publcaton, 011 www.-csrs.org On ew Selecton Procedures for Unequal Probablty Samlng Muhammad Qaser Shahbaz, Saman Shahbaz
More informationComparison of Outlier Detection Methods in Crossover Design Bioequivalence Studies
Journal of Pharmacy and Nutrton Scences, 01,, 16-170 16 Comarson of Outler Detecton Methods n Crossover Desgn Boequvalence Studes A. Rasheed 1,*, T. Ahmad,# and J.S. Sddq,# 1 Deartment of Research, Dow
More informationInterval Estimation in the Classical Normal Linear Regression Model. 1. Introduction
ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model
More information7.1. Single classification analysis of variance (ANOVA) Why not use multiple 2-sample 2. When to use ANOVA
Sngle classfcaton analyss of varance (ANOVA) When to use ANOVA ANOVA models and parttonng sums of squares ANOVA: hypothess testng ANOVA: assumptons A non-parametrc alternatve: Kruskal-Walls ANOVA Power
More informationx i1 =1 for all i (the constant ).
Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by
More informationStatistics for Business and Economics
Statstcs for Busness and Economcs Chapter 11 Smple Regresson Copyrght 010 Pearson Educaton, Inc. Publshng as Prentce Hall Ch. 11-1 11.1 Overvew of Lnear Models n An equaton can be ft to show the best lnear
More informationProfessor Chris Murray. Midterm Exam
Econ 7 Econometrcs Sprng 4 Professor Chrs Murray McElhnney D cjmurray@uh.edu Mdterm Exam Wrte your answers on one sde of the blank whte paper that I have gven you.. Do not wrte your answers on ths exam.
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Experments- MODULE LECTURE - 6 EXPERMENTAL DESGN MODELS Dr. Shalabh Department of Mathematcs and Statstcs ndan nsttute of Technology Kanpur Two-way classfcaton wth nteractons
More informationOutline. Zero Conditional mean. I. Motivation. 3. Multiple Regression Analysis: Estimation. Read Wooldridge (2013), Chapter 3.
Outlne 3. Multple Regresson Analyss: Estmaton I. Motvaton II. Mechancs and Interpretaton of OLS Read Wooldrdge (013), Chapter 3. III. Expected Values of the OLS IV. Varances of the OLS V. The Gauss Markov
More informationChapter 9: Statistical Inference and the Relationship between Two Variables
Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,
More informationChapter 15 - Multiple Regression
Chapter - Multple Regresson Chapter - Multple Regresson Multple Regresson Model The equaton that descrbes how the dependent varable y s related to the ndependent varables x, x,... x p and an error term
More informationGenCB 511 Coarse Notes Population Genetics NONRANDOM MATING & GENETIC DRIFT
NONRANDOM MATING & GENETIC DRIFT NONRANDOM MATING/INBREEDING READING: Hartl & Clark,. 111-159 Wll dstngush two tyes of nonrandom matng: (1) Assortatve matng: matng between ndvduals wth smlar henotyes or
More informationKernel Methods and SVMs Extension
Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general
More informationStatistics for Managers Using Microsoft Excel/SPSS Chapter 14 Multiple Regression Models
Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 14 Multple Regresson Models 1999 Prentce-Hall, Inc. Chap. 14-1 Chapter Topcs The Multple Regresson Model Contrbuton of Indvdual Independent Varables
More informationMD. LUTFOR RAHMAN 1 AND KALIPADA SEN 2 Abstract
ISSN 058-71 Bangladesh J. Agrl. Res. 34(3) : 395-401, September 009 PROBLEMS OF USUAL EIGHTED ANALYSIS OF VARIANCE (ANOVA) IN RANDOMIZED BLOCK DESIGN (RBD) ITH MORE THAN ONE OBSERVATIONS PER CELL HEN ERROR
More informationReview: Fit a line to N data points
Revew: Ft a lne to data ponts Correlated parameters: L y = a x + b Orthogonal parameters: J y = a (x ˆ x + b For ntercept b, set a=0 and fnd b by optmal average: ˆ b = y, Var[ b ˆ ] = For slope a, set
More informationMachine Learning for Signal Processing Linear Gaussian Models
Machne Learnng for Sgnal Processng Lnear Gaussan Models Class 7. 30 Oct 204 Instructor: Bhksha Raj 755/8797 Recap: MAP stmators MAP (Mamum A Posteror: Fnd a best guess for (statstcall, gven knon = argma
More informationsince [1-( 0+ 1x1i+ 2x2 i)] [ 0+ 1x1i+ assumed to be a reasonable approximation
Econ 388 R. Butler 204 revsons Lecture 4 Dummy Dependent Varables I. Lnear Probablty Model: the Regresson model wth a dummy varables as the dependent varable assumpton, mplcaton regular multple regresson
More informationChapter 6. Supplemental Text Material
Chapter 6. Supplemental Text Materal S6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.
More information10-701/ Machine Learning, Fall 2005 Homework 3
10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40
More informationThe Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD
he Gaussan classfer Nuno Vasconcelos ECE Department, UCSD Bayesan decson theory recall that we have state of the world X observatons g decson functon L[g,y] loss of predctng y wth g Bayes decson rule s
More information