Basic R Programming: Exercises


R Programming
John Fox
ICPSR, Summer

1. Logistic Regression: Iterated weighted least squares (IWLS) is a standard method of fitting generalized linear models to data. As described in Section 5.5 of An R and S-PLUS Companion to Applied Regression (Fox, 2002), the IWLS algorithm applied to binomial logistic regression proceeds as follows:

(a) Set the regression coefficients to initial values, such as $\beta^{(0)} = 0$ (where the superscript 0 indicates start values).

(b) At each iteration $t$ calculate the current fitted probabilities $\mu$, variance-function values $v$, working-response values $z$, and weights $w$:

$$\mu_i^{(t)} = \left[1 + \exp\left(-\eta_i^{(t)}\right)\right]^{-1}$$
$$v_i^{(t)} = \mu_i^{(t)} \left(1 - \mu_i^{(t)}\right)$$
$$z_i^{(t)} = \eta_i^{(t)} + \left(y_i - \mu_i^{(t)}\right) / v_i^{(t)}$$
$$w_i^{(t)} = n_i v_i^{(t)}$$

Here, $\eta_i^{(t)} = x_i' \beta^{(t)}$ is the current linear predictor, and $n_i$ represents the binomial denominator for the $i$th observation; for binary data, all of the $n_i$ are 1.

(c) Regress the working response on the predictors by weighted least squares, minimizing the weighted residual sum of squares

$$\sum_{i=1}^{n} w_i^{(t)} \left(z_i^{(t)} - x_i' \beta\right)^2$$

where $x_i'$ is the $i$th row of the model matrix.

(d) Repeat steps (b) and (c) until the regression coefficients stabilize at the maximum-likelihood estimator $\hat{\beta}$.

(e) Calculate the estimated asymptotic covariance matrix of the coefficients as

$$\hat{V}(\hat{\beta}) = (X' W X)^{-1}$$

where $W = \mathrm{diag}\{w_i\}$ is the diagonal matrix of weights from the last iteration and $X$ is the model matrix.

Problem: Program this method in R. The function that you define should take (at least) three arguments: the model matrix X; the response vector of observed proportions y; and the vector of binomial denominators n. I suggest that you let n default to a vector of 1s (i.e., for binary data, where y consists of 0s and 1s), and that you attach a column of 1s to the model matrix for the regression constant so that the user does not have to do this when the function is called.

Programming hints:

Adapt the structure of the example developed in Section of Writing Programs (Fox and Weisberg, draft), but note that this example is for binary logistic regression, while the current exercise is to program the more general binomial logit model.

Use the lsfit function to get the weighted-least-squares fit, calling the function as lsfit(X, z, w, intercept=FALSE), where X is the model matrix; z is the current working response; and w is the current weight vector. The argument intercept=FALSE is needed because the model matrix already has a column of 1s. The function lsfit returns a list, with element $coef containing the regression coefficients. See ?lsfit for details.

One tricky point is that lsfit requires that the weights (w) be a vector, while your calculation will probably produce a one-column matrix of weights. You can coerce the weights to a vector using the function as.vector.

Return a list with the maximum-likelihood estimates of the coefficients, the covariance matrix of the coefficients, and the number of iterations required.

You can test your function on the Mroz data in the car package, being careful to make all of the variables numeric. You might also try fitting a binomial (as opposed to binary) logit model.

2. A Challenging Problem. Ordered Logit and Probit Models: Ordered logit and probit models are popular regression models for ordinal response variables; the ordered logit model is also called the proportional-odds model (see below for an explanation). The following description is adapted from Fox, Applied Regression Analysis and Generalized Linear Models, Second Edition (2008, Ch. 14):

Imagine that there is a latent (i.e., unobservable) variable $\xi$ that is a linear function of the $X$s plus a random error:

$$\xi_i = \alpha + \beta_1 X_{i1} + \cdots + \beta_k X_{ik} + \varepsilon_i$$

The latent response $\xi$ is dissected by $m - 1$ thresholds (i.e., boundaries) into $m$ regions. Denoting the thresholds by $\alpha_1 < \alpha_2 < \cdots < \alpha_{m-1}$, and the resulting response by $Y$, we observe

$$Y_i = \begin{cases} 1 & \text{if } \xi_i \le \alpha_1 \\ 2 & \text{if } \alpha_1 < \xi_i \le \alpha_2 \\ \vdots & \\ m - 1 & \text{if } \alpha_{m-2} < \xi_i \le \alpha_{m-1} \\ m & \text{if } \alpha_{m-1} < \xi_i \end{cases} \tag{1}$$

The thresholds, regions, and corresponding values of $\xi$ and $Y$ are represented graphically in the following figure. Notice that the thresholds are not in general uniformly spaced.
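Before turning to the figure, the rule in Equation 1 can be made concrete with a few lines of R. This is only a sketch under stated assumptions: the threshold values in `alpha` and the latent values in `xi` are hypothetical numbers chosen for illustration, and `cut()` is just one convenient way to apply the rule.

```r
# Hypothetical thresholds alpha_1 < alpha_2 < alpha_3, so m = 4 categories.
# These values are illustrative assumptions, not taken from the text.
alpha <- c(-1, 0.5, 2)

# Hypothetical latent values, three of which fall exactly on a threshold.
xi <- c(-2.3, -1, 0, 0.5, 1.7, 2, 3.1)

# cut() with right = TRUE (its default) forms intervals of the form (a, b],
# which matches the rule alpha_{j-1} < xi <= alpha_j in Equation 1.
# labels = FALSE returns the integer category codes 1, ..., m.
Y <- cut(xi, breaks = c(-Inf, alpha, Inf), labels = FALSE)

# Show each latent value next to its observed category.
print(data.frame(xi = xi, Y = Y))
```

A latent value equal to a threshold falls in the lower category, as the inequalities in Equation 1 require.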

[Figure: the latent response $\xi$ on a horizontal axis, cut by the thresholds $\alpha_1, \alpha_2, \ldots, \alpha_{m-2}, \alpha_{m-1}$ into regions corresponding to $Y = 1, 2, \ldots, m - 1, m$.]

Using Equation 1, we can determine the cumulative probability distribution of $Y$:

$$\begin{aligned} \Pr(Y_i \le j) &= \Pr(\xi_i \le \alpha_j) \\ &= \Pr(\alpha + \beta_1 X_{i1} + \cdots + \beta_k X_{ik} + \varepsilon_i \le \alpha_j) \\ &= \Pr(\varepsilon_i \le \alpha_j - \alpha - \beta_1 X_{i1} - \cdots - \beta_k X_{ik}) \end{aligned}$$

If the errors $\varepsilon_i$ are independently distributed according to the standard normal distribution, then we obtain the ordered probit model. If the errors follow the similar logistic distribution, then we get the ordered logit model. In the latter event,

$$\mathrm{logit}[\Pr(Y_i \le j)] = \log_e \frac{\Pr(Y_i \le j)}{\Pr(Y_i > j)} = \alpha_j - \alpha - \beta_1 X_{i1} - \cdots - \beta_k X_{ik}$$

Equivalently,

$$\mathrm{logit}[\Pr(Y_i > j)] = \log_e \frac{\Pr(Y_i > j)}{\Pr(Y_i \le j)} = (\alpha - \alpha_j) + \beta_1 X_{i1} + \cdots + \beta_k X_{ik} \tag{2}$$

for $j = 1, 2, \ldots, m - 1$.

The logits in Equation 2 are for cumulative categories, at each point contrasting categories above category $j$ with category $j$ and below. The slopes for each of these regression equations are identical; the equations differ only in their intercepts. Put another way, for a fixed set of $X$s, any two different cumulative log-odds (i.e., logits), say, at categories $j$ and $j'$, differ only by the constant $(\alpha_{j'} - \alpha_j)$. The odds, therefore, are proportional to one another; that is,

$$\frac{\mathrm{odds}_j}{\mathrm{odds}_{j'}} = \exp(\mathrm{logit}_j - \mathrm{logit}_{j'}) = \exp(\alpha_{j'} - \alpha_j) = \frac{e^{\alpha_{j'}}}{e^{\alpha_j}}$$

where, for example, $\mathrm{odds}_j = \Pr(Y_i > j) / \Pr(Y_i \le j)$ and $\mathrm{logit}_j = \mathrm{logit}[\Pr(Y_i > j)]$. For this reason, Equation 2 is called the proportional-odds logit model.

There are $(k + 1) + (m - 1) = k + m$ parameters to estimate in the proportional-odds model, including the regression coefficients $\alpha, \beta_1, \ldots, \beta_k$ and the category thresholds $\alpha_1, \ldots, \alpha_{m-1}$. Note, however, that there is an extra parameter in the regression equations (Equation 2), because each equation has its own constant, $-\alpha_j$, along with the common constant $\alpha$. A simple solution is to set $\alpha = 0$ (and to absorb the negative sign into $\alpha_j$), producing

$$\mathrm{logit}[\Pr(Y_i > j)] = \alpha_j + \beta_1 X_{i1} + \cdots + \beta_k X_{ik}$$

and thus

$$\Pr(Y_i > j) = \Lambda(\alpha_j + \beta_1 X_{i1} + \cdots + \beta_k X_{ik}), \quad j = 1, \ldots, m - 1 \tag{3}$$
$$\Pr(Y_i > j) = \Lambda(\alpha_j + x_i' \beta) \tag{4}$$

where $\Lambda(\cdot)$ is the cumulative logistic distribution. In this parametrization, the intercepts $\alpha_j$ are the negatives of the category thresholds.

The ordered probit model is similar, with

$$\Pr(Y_i > j) = \Phi(\alpha_j + \beta_1 X_{i1} + \cdots + \beta_k X_{ik}), \quad j = 1, \ldots, m - 1 \tag{5}$$
$$\Pr(Y_i > j) = \Phi(\alpha_j + x_i' \beta) \tag{6}$$

where $\Phi(\cdot)$ is the cumulative normal distribution.

The log-likelihood under both the ordered logit and ordered probit model takes the following form:

$$\log_e L(\alpha, \beta) = \sum_{i=1}^{n} \log_e \left(\pi_{i1}^{w_{i1}} \pi_{i2}^{w_{i2}} \cdots \pi_{im}^{w_{im}}\right)$$

where $\alpha$ is an $(m - 1) \times 1$ vector containing all of the regression constants and $\beta$ is the $k \times 1$ vector containing the other regression coefficients; $\pi_{ij} = \Pr(Y_i = j)$ (i.e., the probability under the model that individual $i$ is in response category $j$); and the $w_{ij}$ are indicator variables equal to 1 if individual $i$ is observed in category $j$ and 0 otherwise. Thus, for each individual, only one of the $w_{ij}$ is equal to 1 and only the corresponding $\pi_{ij}$ contributes to the likelihood.

Note that for either the ordered logit model or the ordered probit model, the individual-category probabilities can be computed as differences between adjacent cumulative probabilities from Equation 3 or 5 (which are functions of the parameters):

$$\begin{aligned} \pi_{i1} &= 1 - \Pr(Y_i > 1) \\ \pi_{i2} &= \Pr(Y_i > 1) - \Pr(Y_i > 2) \\ &\;\;\vdots \\ \pi_{im} &= \Pr(Y_i > m - 1) \end{aligned}$$

Problem: Program the ordered-logit model or the ordered-probit model (or both). The function that you define should take (at least) two arguments: the model matrix X and the response vector y, which should be a factor or ordered factor. I suggest that you attach a column of 1s to the model matrix for the regression constants so that the user does not have to do this when the function is called; the ordered logit and probit models always have constants. Your function can include an argument to indicate which model, logit or probit, is to be fit.

Programming hints:

The parameters consist of the intercepts and the other regression coefficients, say a and b. Although there are cleverer ways to proceed, you can set b to a vector of zeroes to start, and compute start values for a from the marginal distribution of the response; e.g.,

    marg.p <- rev(cumsum(rev(table(y)/n)))[-1]
    a <- log(marg.p/(1 - marg.p))

Here y is the response vector and n is the number of observations.
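To see what these two lines produce, here is a small sketch on a hypothetical ten-observation response (the factor `y` below is made up for illustration, and the names `cum.p` and `cat.p` are introduced here, not in the text). It also checks one reason these are sensible start values: with b set to zero, the category probabilities implied by the logit model reproduce the observed marginal proportions.

```r
# Hypothetical ordinal response with m = 3 categories (illustrative data only).
y <- factor(c("low", "low", "low", "mid", "mid", "mid", "mid",
              "high", "high", "high"),
            levels = c("low", "mid", "high"), ordered = TRUE)
n <- length(y)

# Start values from the hint: marginal Pr(Y > j) for j = 1, ..., m - 1,
# then their logits.
marg.p <- rev(cumsum(rev(table(y)/n)))[-1]
a <- log(marg.p/(1 - marg.p))

# For these data, Pr(Y > 1) = 0.7 and Pr(Y > 2) = 0.3.
print(marg.p)
print(a)

# With b = 0 the linear predictor is just a_j, so the cumulative
# probabilities are plogis(a) and the category probabilities are the
# differences between adjacent cumulative probabilities.
cum.p <- plogis(a)
cat.p <- -diff(c(1, cum.p, 0))

# These should match the observed proportions table(y)/n.
print(rbind(model = unname(cat.p), observed = as.vector(table(y)/n)))
```

For the probit model the same check would use qnorm() and pnorm() in place of the logit and plogis(); that substitution is a suggestion, since the hint shows only the logit start values.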

If you're fitting the ordered logit model, use the cumulative logistic distribution function plogis(); if you're fitting the ordered probit model, use the cumulative normal distribution function pnorm().

Use optim() to maximize the likelihood, treating the lreg2() function in Section of Writing Programs (Fox and Weisberg, draft) as a model, but noting that for the ordered logit and probit models, I have shown only the log-likelihood and not the gradient.

Return a list with the maximum-likelihood estimates of the coefficients, including the intercepts or thresholds (negative of the intercepts); the covariance matrix of the coefficients (obtained from the inverse-Hessian returned by optim()); the residual deviance for the model (i.e., minus twice the maximized log-likelihood); and an indication of whether or not the computation converged.

The ordered logit and probit models may be fit by the polr() function in the MASS package (one of the standard R packages). You can use polr() to verify that your function works properly.

To test your program, you can use the WVS dataset in the effects package. For testing purposes, use a simple additive model rather than the model with interactions given in ?WVS.

3. General Cumulative Logit and Probit Models: The ordered logit and probit models of the previous problem make the strong assumption that all $m - 1$ cumulative probabilities can be modeled with the same regression coefficients, except for different intercepts. More general versions of these models permit different regression coefficients:

$$\Pr(Y_i > j) = \Lambda(\alpha_j + \beta_{1j} X_{i1} + \cdots + \beta_{kj} X_{ik}), \quad j = 1, \ldots, m - 1$$

or

$$\Pr(Y_i > j) = \Phi(\alpha_j + \beta_{1j} X_{i1} + \cdots + \beta_{kj} X_{ik}), \quad j = 1, \ldots, m - 1$$

Program one or the other (or both) of these models. For your example regression, use a likelihood-ratio test to compare the more general cumulative logit or probit model to the more restrictive ordered logit or probit model of the preceding problem. This test checks the assumption of equal slopes. The cumulative logit and probit models (along with the ordered logit and probit models) can be fit by the vglm() function in the VGAM package.

4. Numerical Linear Algebra: A matrix is said to be in reduced row-echelon form when it satisfies the following criteria:

(a) All of its nonzero rows (if any) precede all of its zero rows (if any).
(b) The first entry (from left to right), called the leading entry, in each nonzero row is 1.
(c) The leading entry in each nonzero row after the first is to the right of the leading entry in the previous row.
(d) All other entries are 0 in a column containing a leading entry.

A matrix can be put into reduced row-echelon form by a sequence of elementary row operations, which are of three types:

(a) Multiply each entry in a row by a nonzero constant.
(b) Add a multiple of one row to another, replacing the other row.
(c) Exchange two rows.

Gaussian elimination is a method for transforming a matrix to reduced row-echelon form by elementary row operations. Starting at the first row and first column of the matrix, and proceeding down and to the right:

(a) If there is a 0 in the current row and column (called the pivot), if possible exchange for a lower row to bring a nonzero element into the pivot position; if there is no nonzero pivot available, move to the right and repeat this step. If there are no nonzero elements anywhere to the right (and below), then stop.
(b) Divide the current row by the pivot, putting a 1 in the pivot position.
(c) Proceeding through the other rows of the matrix, multiply the pivot row by the element in the pivot column in another row, subtracting the result from the other row; this zeroes out the pivot column.

Consider the following example (the matrix displayed after each operation is not reproduced here):

Divide row 1 by −2.
Subtract 4 × row 1 from row 2.
Subtract 6 × row 1 from row 3.
Multiply row 2 by −1.
Subtract 0.5 × row 2 from row 1.
Add 2 × row 2 to row 3.
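The six operations listed above can be traced in R. The matrices from the example are not available here, so the 3 × 4 matrix `A` below is an assumed stand-in, chosen only because each listed operation applies to it exactly as described; it should not be read as the matrix from the original example. The steps are written out one by one rather than as a general function, which is left as the exercise.

```r
# Assumed stand-in matrix (hypothetical; chosen so that the listed
# operations are exactly the ones Gaussian elimination calls for).
A <- matrix(c(-2, 0, -1, 2,
               4, 0,  1, 0,
               6, 0,  1, 2), nrow = 3, byrow = TRUE)

A[1, ] <- A[1, ] / -2            # divide row 1 by -2 (pivot becomes 1)
A[2, ] <- A[2, ] - 4 * A[1, ]    # subtract 4 x row 1 from row 2
A[3, ] <- A[3, ] - 6 * A[1, ]    # subtract 6 x row 1 from row 3

# Column 2 now has no nonzero element in rows 2 and 3, so the pivot
# search moves right to column 3, where row 2 holds -1.
A[2, ] <- A[2, ] * -1            # multiply row 2 by -1 (pivot becomes 1)
A[1, ] <- A[1, ] - 0.5 * A[2, ]  # subtract 0.5 x row 2 from row 1
A[3, ] <- A[3, ] + 2 * A[2, ]    # add 2 x row 2 to row 3

print(A)

# Rank = number of nonzero rows, testing against a small tolerance
# (the value 1e-8 is an arbitrary illustrative choice).
rank.A <- sum(rowSums(abs(A) > 1e-8) > 0)
print(rank.A)
```

For this stand-in matrix the third row is reduced to zeros, so two nonzero rows remain.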

The matrix is now in reduced row-echelon form. The rank of a matrix is the number of nonzero rows in its reduced row-echelon form, and so the matrix in this example is of rank 2.

Problem: Write an R function to calculate the reduced row-echelon form of a matrix by elimination.

Programming hints:

When you do floating-point arithmetic on a computer, there are almost always rounding errors. One consequence is that you cannot rely on a number being exactly equal to a value such as 0. When you test that an element, say x, is 0, therefore, you should do so within a tolerance (e.g., checking that the absolute value of x is smaller than some small positive number).

The computations tend to be more accurate if the absolute values of the pivots are as large as possible. Consequently, you can exchange a row for a lower one to get a larger pivot even if the element in the pivot position is nonzero.

5. A less difficult problem: Write a function to compute running medians. Running medians are a simple smoothing method usually applied to time series. For example, for the numbers 7, 5, 2, 8, 5, 5, 9, 4, 7, 8, the running medians of length 3 are 5, 5, 5, 5, 5, 5, 7, 7. The first running median is the median of the three numbers 7, 5, and 2; the second running median is the median of 5, 2, and 8; and so on. Your function should take two arguments: the data (say, x), and the number of observations for each median (say, length). Notice that there are fewer running medians than observations. How many fewer?

6. Simulation: Develop a simulation illustrating the central limit theorem for the mean: Almost regardless of the population distribution of $X$, the mean $\bar{X}$ of repeated samples of size $n$ drawn from the population is approximately normally distributed, with mean $E(\bar{X}) = E(X) = \mu$ and variance $V(\bar{X}) = V(X)/n = \sigma^2/n$, and with the approximation improving as the sample size grows.

Sample from a highly skewed distribution, such as the exponential distribution with a small rate parameter $\lambda$ (e.g., $\lambda = 1$); use several different sample sizes, such as 1, 2, 5, 25, and 100, and draw many samples of each size, comparing the observed distribution of sample means with the approximating normal distribution.

Exponential random variables may be generated in R using the rexp() function. Note that the mean of an exponential random variable is $1/\lambda$ and its variance is $1/\lambda^2$.
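One possible shape for such a simulation is sketched below. It is a starting point under stated assumptions rather than a model answer: the number of replications (`reps`), the seed, the histogram settings, and all object names are choices made for this sketch, not part of the exercise.

```r
set.seed(123)                 # arbitrary seed, for reproducibility only

lambda  <- 1                  # rate of the exponential population
n.sizes <- c(1, 2, 5, 25, 100)
reps    <- 5000               # samples drawn per sample size (assumed value)

# For each sample size, draw 'reps' samples and keep each sample's mean.
sim.means <- lapply(n.sizes, function(n) {
  replicate(reps, mean(rexp(n, rate = lambda)))
})

# Compare observed moments of the sample means with the theoretical
# values: mean 1/lambda and variance (1/lambda^2)/n.
summary.table <- data.frame(
  n           = n.sizes,
  mean.obs    = sapply(sim.means, mean),
  mean.theory = 1/lambda,
  var.obs     = sapply(sim.means, var),
  var.theory  = (1/lambda^2)/n.sizes
)
print(summary.table)

# Histograms of the sample means with the approximating normal density.
op <- par(mfrow = c(2, 3))
for (i in seq_along(n.sizes)) {
  hist(sim.means[[i]], freq = FALSE, breaks = 50,
       main = paste("n =", n.sizes[i]), xlab = "sample mean")
  curve(dnorm(x, mean = 1/lambda, sd = (1/lambda)/sqrt(n.sizes[i])),
        add = TRUE)
}
par(op)
```

The histograms for the smallest sample sizes should remain visibly skewed, while those for the larger sizes should track the normal curve more closely; normal quantile-comparison plots (e.g., with qqnorm()) would be another way to make the same comparison.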


More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

Hydrological statistics. Hydrological statistics and extremes

Hydrological statistics. Hydrological statistics and extremes 5--0 Stochastc Hydrology Hydrologcal statstcs and extremes Marc F.P. Berkens Professor of Hydrology Faculty of Geoscences Hydrologcal statstcs Mostly concernes wth the statstcal analyss of hydrologcal

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE)

ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE) ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE) June 7, 016 15:30 Frst famly name: Name: DNI/ID: Moble: Second famly Name: GECO/GADE: Instructor: E-mal: Queston 1 A B C Blank Queston A B C Blank Queston

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

8.6 The Complex Number System

8.6 The Complex Number System 8.6 The Complex Number System Earler n the chapter, we mentoned that we cannot have a negatve under a square root, snce the square of any postve or negatve number s always postve. In ths secton we want

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Lecture 6 More on Complete Randomized Block Design (RBD)

Lecture 6 More on Complete Randomized Block Design (RBD) Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For

More information

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS)

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS) Some Comments on Acceleratng Convergence of Iteratve Sequences Usng Drect Inverson of the Iteratve Subspace (DIIS) C. Davd Sherrll School of Chemstry and Bochemstry Georga Insttute of Technology May 1998

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

Statistics MINITAB - Lab 2

Statistics MINITAB - Lab 2 Statstcs 20080 MINITAB - Lab 2 1. Smple Lnear Regresson In smple lnear regresson we attempt to model a lnear relatonshp between two varables wth a straght lne and make statstcal nferences concernng that

More information

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model

More information

Introduction to Regression

Introduction to Regression Introducton to Regresson Dr Tom Ilvento Department of Food and Resource Economcs Overvew The last part of the course wll focus on Regresson Analyss Ths s one of the more powerful statstcal technques Provdes

More information

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique Outlne and Readng Dynamc Programmng The General Technque ( 5.3.2) -1 Knapsac Problem ( 5.3.3) Matrx Chan-Product ( 5.3.1) Dynamc Programmng verson 1.4 1 Dynamc Programmng verson 1.4 2 Dynamc Programmng

More information

Chapter 14 Simple Linear Regression

Chapter 14 Simple Linear Regression Chapter 4 Smple Lnear Regresson Chapter 4 - Smple Lnear Regresson Manageral decsons often are based on the relatonshp between two or more varables. Regresson analss can be used to develop an equaton showng

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

Randomness and Computation

Randomness and Computation Randomness and Computaton or, Randomzed Algorthms Mary Cryan School of Informatcs Unversty of Ednburgh RC 208/9) Lecture 0 slde Balls n Bns m balls, n bns, and balls thrown unformly at random nto bns usually

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

Laboratory 3: Method of Least Squares

Laboratory 3: Method of Least Squares Laboratory 3: Method of Least Squares Introducton Consder the graph of expermental data n Fgure 1. In ths experment x s the ndependent varable and y the dependent varable. Clearly they are correlated wth

More information

Singular Value Decomposition: Theory and Applications

Singular Value Decomposition: Theory and Applications Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real

More information

Chapter 6. Supplemental Text Material

Chapter 6. Supplemental Text Material Chapter 6. Supplemental Text Materal S6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.

More information

Differentiating Gaussian Processes

Differentiating Gaussian Processes Dfferentatng Gaussan Processes Andrew McHutchon Aprl 17, 013 1 Frst Order Dervatve of the Posteror Mean The posteror mean of a GP s gven by, f = x, X KX, X 1 y x, X α 1 Only the x, X term depends on the

More information

Number of cases Number of factors Number of covariates Number of levels of factor i. Value of the dependent variable for case k

Number of cases Number of factors Number of covariates Number of levels of factor i. Value of the dependent variable for case k ANOVA Model and Matrx Computatons Notaton The followng notaton s used throughout ths chapter unless otherwse stated: N F CN Y Z j w W Number of cases Number of factors Number of covarates Number of levels

More information

Natural Images, Gaussian Mixtures and Dead Leaves Supplementary Material

Natural Images, Gaussian Mixtures and Dead Leaves Supplementary Material Natural Images, Gaussan Mxtures and Dead Leaves Supplementary Materal Danel Zoran Interdscplnary Center for Neural Computaton Hebrew Unversty of Jerusalem Israel http://www.cs.huj.ac.l/ danez Yar Wess

More information

Properties of Least Squares

Properties of Least Squares Week 3 3.1 Smple Lnear Regresson Model 3. Propertes of Least Squares Estmators Y Y β 1 + β X + u weekly famly expendtures X weekly famly ncome For a gven level of x, the expected level of food expendtures

More information

Formulas for the Determinant

Formulas for the Determinant page 224 224 CHAPTER 3 Determnants e t te t e 2t 38 A = e t 2te t e 2t e t te t 2e 2t 39 If 123 A = 345, 456 compute the matrx product A adj(a) What can you conclude about det(a)? For Problems 40 43, use

More information

Logistic Regression Maximum Likelihood Estimation

Logistic Regression Maximum Likelihood Estimation Harvard-MIT Dvson of Health Scences and Technology HST.951J: Medcal Decson Support, Fall 2005 Instructors: Professor Lucla Ohno-Machado and Professor Staal Vnterbo 6.873/HST.951 Medcal Decson Support Fall

More information

Statistics II Final Exam 26/6/18

Statistics II Final Exam 26/6/18 Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Primer on High-Order Moment Estimators

Primer on High-Order Moment Estimators Prmer on Hgh-Order Moment Estmators Ton M. Whted July 2007 The Errors-n-Varables Model We wll start wth the classcal EIV for one msmeasured regressor. The general case s n Erckson and Whted Econometrc

More information

Laboratory 1c: Method of Least Squares

Laboratory 1c: Method of Least Squares Lab 1c, Least Squares Laboratory 1c: Method of Least Squares Introducton Consder the graph of expermental data n Fgure 1. In ths experment x s the ndependent varable and y the dependent varable. Clearly

More information

Introduction to Generalized Linear Models

Introduction to Generalized Linear Models INTRODUCTION TO STATISTICAL MODELLING TRINITY 00 Introducton to Generalzed Lnear Models I. Motvaton In ths lecture we extend the deas of lnear regresson to the more general dea of a generalzed lnear model

More information

Some modelling aspects for the Matlab implementation of MMA

Some modelling aspects for the Matlab implementation of MMA Some modellng aspects for the Matlab mplementaton of MMA Krster Svanberg krlle@math.kth.se Optmzaton and Systems Theory Department of Mathematcs KTH, SE 10044 Stockholm September 2004 1. Consdered optmzaton

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrcs of Panel Data Jakub Mućk Meetng # 8 Jakub Mućk Econometrcs of Panel Data Meetng # 8 1 / 17 Outlne 1 Heterogenety n the slope coeffcents 2 Seemngly Unrelated Regresson (SUR) 3 Swamy s random

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

Advanced Statistical Methods: Beyond Linear Regression

Advanced Statistical Methods: Beyond Linear Regression Advanced Statstcal Methods: Beyond Lnear Regresson John R. Stevens Utah State Unversty Notes 2. Statstcal Methods I Mathematcs Educators Workshop 28 March 2009 1 http://www.stat.usu.edu/~rstevens/pcm 2

More information

Polynomial Regression Models

Polynomial Regression Models LINEAR REGRESSION ANALYSIS MODULE XII Lecture - 6 Polynomal Regresson Models Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Test of sgnfcance To test the sgnfcance

More information

Scientific Question Determine whether the breastfeeding of Nepalese children varies with child age and/or sex of child.

Scientific Question Determine whether the breastfeeding of Nepalese children varies with child age and/or sex of child. Longtudnal Logstc Regresson: Breastfeedng of Nepalese Chldren PART II GEE models (margnal, populaton average) covered last lab Random Intercept models (subject specfc) Transton models Scentfc Queston Determne

More information