CE 651 Transportation Economics
Charisma Choudhury
Lecture 3-4: Analysis of Demand

Continuous vs. Discrete Goods
[Figure: two panels. Continuous Goods: indifference curves u1, u2, u3 plotted on axes x1 and x2. Discrete Goods: auto vs. bus, each axis taking the values 0 and 1.]
Outline
- Data
- Modeling Principles
- Assumptions
- Estimates
- Statistical Tests
- Potential data problems

Data
- Cross-section
  - Activities of individual persons, firms or other units at a single time
  - One observation per individual
- Time series
  - Movement of a variable over time
  - Annual, quarterly, monthly, weekly observations, etc.
  - Mostly used for national or regional level aggregation of the observations
- Pooled/panel
  - Combination of time series and cross-section
  - Behavior of individual persons, firms or other units over time
Examples: Cross-sectional data
Vehicle Miles Travelled in 2008
ID | VMT | No of cars in HH | HH Income | No of HH members | No of Children in HH | TAZ
1 | 1000 | | >80k | 4 | | 101
2 | 100 | | 30-50k | 3 | 1 | 101
3 | 600 | 1 | 30-50k | | 0 | 104

Examples: Time series data
Vehicle Miles Travelled in between 2006-2008
Year | Avg Fuel Price/L | Average VMT/person | Average car ownership
2000 | 0 | 500 | 0.01
2001 | | 575 | 0.0
2002 | 5 | | 0.04
2006 | 45 | 800 | 0.06
2007 | 50 | 900 | 0.06
2008 | 77 | 1000 | 0.06
Examples: Pooled/Panel data
Vehicle Miles Travelled in between 2006-2008
ID | Year | Avg Fuel Price/L | VMT | No of cars in HH | HH Income | No of HH members | No of Children in HH | TAZ
1 | 2006 | 30 | 800 | | >80k | 4 | | 101
1 | 2007 | 50 | 900 | | >80k | 4 | | 101
1 | 2008 | 77 | 1000 | | >80k | 4 | | 101
2 | 2006 | 30 | 1000 | | 30-50k | 3 | 1 | 101
2 | 2007 | 50 | 1500 | | 30-50k | 3 | 1 | 101
2 | 2008 | 77 | 100 | | 30-50k | 3 | 1 | 101

Examples: Pseudo panel data
Vehicle Miles Travelled in between 2006-2008
ID | Year | Avg Fuel Price/L | VMT | No of cars in HH | HH Income | No of HH members | No of Children in HH | TAZ
1 | 2006 | 30 | 800 | 1 | >80k | 4 | 1 | 101
2 | 2007 | 50 | 900 | | 50-80k | 3 | | 104
3 | 2008 | 77 | 1000 | | >80k | 4 | | 101
4 | 2006 | 30 | 1000 | 3 | 30-50k | 3 | | 108
5 | 2007 | 50 | 1500 | 1 | 30-50k | | 1 | 101
6 | 2008 | 77 | 100 | 1 | 50-80k | 3 | 1 | 101
Modeling Principles
- Hypothesis
  - Example: VMT = f(fuel cost, no of cars, hh income, hh size)
- Linear relationship
  - Example: $\text{VMT} = \alpha + \beta_{cost} \cdot \text{cost} + \beta_{car} \cdot \text{carNo} + \beta_{inc} \cdot \text{hhinc} + \beta_{size} \cdot \text{hhsize}$
- Non-linear relationship
  - Example: $\text{VMT} = \alpha + \beta_{car} \cdot \text{carNo} + \dfrac{\beta_{cost} \cdot \text{cost}}{1 + \beta_{inc} \cdot \text{hhinc}}$
- In this course we will deal with linear relationships only
- In regression analysis, we estimate the $\alpha$ and $\beta$s that best fit the observed data using estimators

Estimators
- Our interest: the population
- Available: sample/samples from the population
- Use sample information to obtain the best possible estimates
- Estimator: a rule that gives a reasonable estimate for each and every possible sample
  - Estimators are rules
  - Estimates are numbers produced by the estimator
- Desirable properties
  - Unbiasedness
  - Efficiency
  - Consistency (only for large samples)
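The distinction between an estimator (a rule) and an estimate (a number) can be sketched in a few lines of Python. The function name `sample_mean` and both samples are made up for illustration; they are not taken from the lecture data.

```python
# Hypothetical illustration: the estimator is the rule (a function);
# each estimate is the number it returns for one particular sample.

def sample_mean(sample):
    """Estimator: maps any sample to an estimate of the population mean."""
    return sum(sample) / len(sample)

# Two made-up samples of household VMT drawn from the same population.
sample_a = [800, 1000, 1200]
sample_b = [900, 1500, 600, 1200]

estimate_a = sample_mean(sample_a)  # estimate from the first sample
estimate_b = sample_mean(sample_b)  # a different estimate from the same rule

print(estimate_a, estimate_b)
```

The same rule gives a different number for each sample, which is why the properties listed above describe the estimator and not any single estimate.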
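Desirable Properties
We want our estimators to be
- Unbiased
  - Expected value of the estimator equals the true population value
  - $\text{Bias} = E(\beta^*) - \beta_{population}$
- Efficient
  - For a given sample size, variance is smaller than that of any other unbiased estimator
  - Higher efficiency means we can rely more on the results
- Consistent
  - As $N$ increases, $\beta^* \to \beta_{population}$
  - This assumption is required when we do statistical tests (e.g. t-test)

Examples of Estimators
($Y_i$ = observed value, $Y_i^*$ = predicted value, $w_i$ = weight of observation $i$)
- Least/minimum error: $\min \sum_{i=1}^{N} (Y_i - Y_i^*)$
- Least/minimum absolute error: $\min \sum_{i=1}^{N} |Y_i - Y_i^*|$
- Ordinary least squares (OLS): $\min \sum_{i=1}^{N} (Y_i - Y_i^*)^2$
- Weighted least squares (WLS): $\min \sum_{i=1}^{N} w_i (Y_i - Y_i^*)^2$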
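As a rough sketch of how the four criteria differ, the snippet below evaluates each objective for one fixed set of predictions. The observations, predictions and weights are invented for illustration only.

```python
# Made-up observations and predictions, for illustration only.
y_obs = [10.0, 12.0, 15.0, 19.0]
y_hat = [11.0, 12.0, 14.0, 17.0]
weights = [1.0, 1.0, 0.5, 0.5]  # hypothetical WLS weights

residuals = [y - yh for y, yh in zip(y_obs, y_hat)]

# Least error: positive and negative errors can cancel each other.
sum_error = sum(residuals)
# Least absolute error: every error counts with its magnitude.
sum_abs_error = sum(abs(e) for e in residuals)
# OLS: squared errors, so large errors are penalized more heavily.
sum_sq_error = sum(e ** 2 for e in residuals)
# WLS: squared errors, each scaled by its observation weight.
weighted_sq_error = sum(w * e ** 2 for w, e in zip(weights, residuals))

print(sum_error, sum_abs_error, sum_sq_error, weighted_sq_error)
```

An estimator would search over candidate values of the parameters for the smallest value of its objective; here only the objective values for one candidate are computed.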
Two Variable Linear Regression Model
- Model: $Y = \alpha + \beta X + \varepsilon$
  - $X$ = non-stochastic
  - $\varepsilon$ = stochastic random term (often follows certain distributions)

Error ($\varepsilon$)
- Variables cannot provide perfect explanations
- Errors are things that influence Y other than X
- Reasons
  - Simplification of reality, e.g. VMT = f(no of cars, hh income, hh size, hh children, location)
  - Omitted variables: individual tastes, education, lifestyle patterns and many more
  - Measurement errors: privacy issues, poor record keeping, etc.
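Error ($\varepsilon$)
- Prediction error: $\varepsilon_i = Y_i - Y_i^*$
  - $Y_i^*$ = predicted dependent variable $= \alpha^* + \beta^* X_i$
- Sum of squared errors (SSE): $\sum_i \varepsilon_i^2 = \sum_i (Y_i - Y_i^*)^2$
- In OLS, we minimize SSE

Two Variable Linear Regression Model
- Model: $Y = \alpha + \beta X + \varepsilon$
  - $X$ = non-stochastic
  - $\varepsilon$ = stochastic random term (often follows certain distributions)
- Solution
  - $\beta^* = \dfrac{N \sum X_i Y_i - \sum X_i \sum Y_i}{N \sum X_i^2 - (\sum X_i)^2}$
  - $\alpha^* = \bar{Y} - \beta^* \bar{X}$
- If Y varies a lot when X varies little, $\beta^*$ will be big. In other words, $\beta^*$ is the magnitude of the influence of X on Y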
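A minimal sketch of the two-variable OLS solution, using the standard closed-form expressions for the slope and intercept. The function name `ols_two_variable` and the data are hypothetical; the data were chosen to lie exactly on y = 1 + 2x so that the result is easy to check.

```python
def ols_two_variable(x, y):
    """Return (alpha, beta) minimizing the sum of squared errors of y = alpha + beta * x."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    # Slope: N*sum(XY) - sum(X)*sum(Y), divided by N*sum(X^2) - (sum(X))^2
    beta = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    # Intercept: mean(Y) - beta * mean(X)
    alpha = sum_y / n - beta * sum_x / n
    return alpha, beta

# Made-up data lying exactly on y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

alpha, beta = ols_two_variable(x, y)
sse = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
print(alpha, beta, sse)
```

Because the points lie on a straight line, the fitted line reproduces them and the SSE is zero up to rounding; with noisy data the SSE would be positive but still the smallest achievable by any straight line.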
Statistical Significance
- How dependable are the estimates?
- How significant is X in explaining Y?
  - If there is a high probability that $\beta$ is not 0, then $\beta^*$ is statistically significant
  - The smaller the standard errors (variances) are relative to the coefficients, the more confidence we have in the estimates
- How to test? Use t-stats/t-test
  - $t\text{-stat} = \dfrac{\beta^*}{\text{std error of } \beta^*}$
  - Compare with t critical (at 95% or 90% level of confidence) at $(N-k)$ dof ($N$ = number of observations, $k$ = number of estimated parameters)
  - $|t\text{-stat}| > t$ critical: statistically significant

Goodness-of-Fit
- How well the model fits the data
- Measure: $R^2 = 1 - \dfrac{\sum_i \varepsilon_i^2}{\sum_i (Y_i - \bar{Y})^2}$
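The t-stat and R-squared for a two-variable model can be worked through as follows. This is a sketch under stated assumptions: the data are made up, the standard error uses the usual formula sqrt((SSE / (N - k)) / sum((X - mean(X))^2)), and the critical value 3.182 is the two-sided 95% entry of a t-table for 3 degrees of freedom.

```python
import math

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n, k = len(x), 2  # k = number of estimated parameters (alpha and beta)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# OLS estimates (deviation form of the closed-form solution).
beta = s_xy / s_xx
alpha = y_bar - beta * x_bar

residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)        # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)    # total variation around the mean

r_squared = 1.0 - sse / sst
std_error_beta = math.sqrt((sse / (n - k)) / s_xx)
t_stat = beta / std_error_beta

T_CRITICAL_95 = 3.182  # two-sided 95% value for n - k = 3 dof, from a t-table
significant = abs(t_stat) > T_CRITICAL_95

print(round(beta, 3), round(r_squared, 3), round(t_stat, 3), significant)
```

With these numbers the slope is 0.6 and R-squared is 0.6, but the t-stat of about 2.12 is below the critical value, so with only five observations the slope is not statistically significant at the 95% level.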
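Multivariate Linear Regression Model
- $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \varepsilon$
- In matrix notation: $Y = X\beta + \varepsilon$, with $X = [1 \; X_1 \; X_2 \; X_3 \; \dots]$ (the column of ones carries the intercept $\alpha$)
- OLS solution: $\beta^* = (X'X)^{-1}(X'Y)$

Goodness-of-Fit
- $R^2$ never decreases as we add new variables
- Measure: $\bar{R}^2$ (adjusted $R^2$), which accounts for $k$ (number of estimated parameters)
- The model with the higher $\bar{R}^2$ has better goodness-of-fit in absolute terms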
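The matrix solution can be sketched in plain Python by forming X'X and X'Y and solving the resulting linear system. Everything here is illustrative: the helper names `solve` and `ols` and the data are made up, and the adjusted R-squared uses the common definition 1 - (1 - R^2)(N - 1)/(N - k), which is an assumption because the slide does not spell out the formula.

```python
def solve(a, b):
    """Solve the square system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Return the design matrix (with a leading column of ones) and the OLS coefficients."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return x, solve(xtx, xty)  # solves (X'X) beta = X'Y

# Made-up data: roughly y = 1 + 2*x1 + 3*x2 plus small noise.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = [9.5, 7.5, 19.0, 18.5, 28.5]

x, beta = ols(rows, y)
n, k = len(y), len(beta)
fitted = [sum(b * v for b, v in zip(beta, xi)) for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
y_bar = sum(y) / n
sse = sum(e ** 2 for e in residuals)
sst = sum((yi - y_bar) ** 2 for yi in y)

r_squared = 1.0 - sse / sst
adj_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / (n - k)
print(beta, r_squared, adj_r_squared)
```

Solving the normal equations directly keeps the sketch dependency-free; for real work a numerically more robust least-squares routine from a statistics or linear algebra library would normally be preferred.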
Example: Chicago Trip Generation
- Dependent variable: average trips per occupied dwelling unit
- Independent variables
  - average car ownership
  - average household size
  - three zonal social indices

Assignment
- Variations you can try
  - Add other variables
  - Use interaction terms
  - Use log on variables
  - Piecewise linear formulation
- Evaluation criteria
  - Correct signs
  - Improvement in goodness-of-fit
  - t-test
Assumptions of Classical LR Model
1. Relationship between X and Y is linear
2. X is non-stochastic and no exact linear relationship exists between two or more independent variables
3. Error has zero expected value (errors cancel out): $E(\varepsilon_i) = 0$
4. Error has constant variance for all observations: $E(\varepsilon_i^2) = \sigma^2$
5. No correlation among errors: $E(\varepsilon_i \varepsilon_j) = 0$ for all $i \neq j$

Gauss-Markov Theorem
- If assumptions 1-5 are fulfilled, OLS is BLUE (Best Linear Unbiased Estimator)
Violation 1: Collinearity
- Types
  - Perfect correlation
  - Other high interdependence: multicollinearity
- Example: GPA = f(X1, X2, X3, X4, X5)
  - X1 = parents' education level
  - X2 = average hours of study/day
  - X3 = average hours of study/week
  - X4 = parents' income
  - X5 = school
  - X2 and X3 are perfectly collinear
  - X1 and X4 can be multicollinear

Violation 1: Collinearity (cont.)
- Effect
  - Perfect: cannot be estimated
  - Multicollinear: difficult to interpret; affects statistical significance
- Solution: drop one variable
- Caution: may result in bias
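Why perfect collinearity makes estimation impossible can be seen from the OLS matrix solution: X'X becomes singular, so its inverse does not exist. The sketch below uses made-up study hours in which hours per week is exactly 7 times hours per day; the helper names and the comparison variable are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def xtx_with_intercept(x2, x3):
    """Build X'X for the design matrix X = [1, x2, x3]."""
    cols = [[1] * len(x2), x2, x3]
    return [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]

# Made-up data: hours of study per day, and the same quantity per week.
hours_per_day = [1, 2, 3, 4]
hours_per_week = [7 * h for h in hours_per_day]  # exact linear function of hours_per_day
other_variable = [7, 15, 20, 30]                 # not an exact linear function

det_collinear = det3(xtx_with_intercept(hours_per_day, hours_per_week))
det_independent = det3(xtx_with_intercept(hours_per_day, other_variable))

print(det_collinear, det_independent)
```

A zero determinant means X'X cannot be inverted, which is the "cannot be estimated" case; dropping one of the two collinear variables restores a non-zero determinant.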
Violation 2: Heteroscedasticity
- Homoscedastic = constant variance
- Heteroscedastic = variance not constant
- Example
  - Large firm: bigger errors
  - Larger TAZ: bigger errors
- Effect: estimators unbiased but inefficient
- Solution: weighted least squares (WLS)

Violation 3: Serial correlation
- Both cross-section and time series
- Can be positive or negative, e.g.
  - Positive error: incorrect mileage reading
  - Negative error: mileage data taken in Jan 2009 instead of Dec 2008; overestimation of 2008 VMT, underestimation of 2009 VMT
- Effect: estimators unbiased but inefficient
- Solution: Prais-Winsten, Cochrane-Orcutt, Durbin's method
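Returning to the WLS remedy for heteroscedasticity, the sketch below fits a two-variable model with observation weights, using the weighted analogue of the closed-form OLS solution. The function name `wls_two_variable`, the data and the weights are all hypothetical; the last observation is treated as having a much larger error variance and is therefore given a small weight (weights are commonly taken inversely proportional to the error variance).

```python
def wls_two_variable(x, y, w):
    """Return (alpha, beta) minimizing sum(w_i * (y_i - alpha - beta * x_i) ** 2)."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    beta = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    alpha = (swy - beta * swx) / sw
    return alpha, beta

# Made-up data: the first three points lie on y = 1 + 2x; the last one is noisy.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 12.0]

equal_weights = [1.0, 1.0, 1.0, 1.0]    # equal weights reduce WLS to OLS
down_weighted = [1.0, 1.0, 1.0, 0.1]    # hypothetical: last point has high error variance

alpha_ols, beta_ols = wls_two_variable(x, y, equal_weights)
alpha_wls, beta_wls = wls_two_variable(x, y, down_weighted)

print(beta_ols, beta_wls)
```

With equal weights the noisy point pulls the slope up to 2.9; down-weighting it moves the slope back toward 2, the slope of the line the other three points lie on.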