AN INTRODUCTION TO APPLIED ECONOMETRICS (Lecture notes) Jean-Pierre Laffargue University of Paris 1, PSE and CEPREMAP


Jean-Pierre Laffargue, 5/05/04

CONTENTS

Introduction

Chapter 1. Descriptive statistics
  Graphs
  Mean and other numerical summaries
  The concept of robust summary

Chapter 2. Bivariate analysis
  Correlation
  An introduction to simple regression
  Statistical aspects of regressions
  Robustness of regression
  Tutorial: the capital asset pricing model (CAPM) (Berndt, chapter 2)

Chapter 3. Multivariate analysis
  Multiple regressions
  Tutorial: costs, learning curves and scale economies (Berndt, chapter 3)
  Partial correlation
  Fragility analysis
  Regressions with dummy variables
  Tutorial: analysing the determinants of wages and measuring wage discrimination (Berndt, chapter 5)
  A last example as a conclusion

Chapter 4. The econometrics of time series
  Regression with time lags: distributed lag models
  Univariate time series analysis: the autoregressive model of order 1 (the AR(1) model)
  Univariate time series analysis: the autoregressive model of order p (the AR(p) model)
  Regression with time series variables: the case with stationary variables
  Regression with time series variables: the case with non-stationary variables: spurious regressions
  Regression with time series variables: the case with non-stationary variables: cointegration

Chapter 5. Exogeneity
  An example
  The concept of weak exogeneity
  Comment
  Identification and estimation
  How to test the weak exogeneity of a variable: the Hausman test

Conclusion
  Specification, estimation and tests: the Cowles Commission approach
  A first alternative philosophy: from general to specific (David Hendry)
  A second alternative philosophy: exploratory data analysis (EDA)
  A third philosophy: fragility or sensitivity analysis (Leamer)
  Conclusion

INTRODUCTION

These notes are intended for students having no knowledge of econometrics and little knowledge of statistics and probability. For a long time, the tradition in France was to teach econometrics the hard way. In a first stage, students had to learn plenty of mathematical results on various classes of estimators and tests. During this time they had to believe that their hard investment would be profitable in the future and would allow them to deal with economic data and to answer economic problems. Later on they could turn to applications. A problem with this method is that many students became discouraged during the first step of the process. Another problem was that many students who had reached the second step had a tendency to turn to very sophisticated and fragile methods when they faced simple practical problems. Sometimes the results they reached were crazy, because they mixed very complicated methods with very elementary mistakes contradicting basic common sense. The most serious mistakes in econometrics, which can even be found in articles published by good journals, come from not spending enough time looking at the data in a pragmatic way, without a priori and without the strong desire to apply complicated methods that are totally inappropriate for these data.

These notes follow a completely different principle. I will introduce econometrics through a series of simple applications. I will use little mathematics, and I will not be very rigorous. I will appeal to the common sense and the intuition of the reader to introduce the basic concepts, methods and traps of econometrics. I will try to show that econometrics is simple, and that thinking in an econometric way is the same as thinking in an economic way. Sometimes the developments will be a bit tricky, and I hope as funny as the kind of riddles and puzzles you can find in newspapers and magazines. The book by Berndt (quoted among the references) is entertaining and pleasant to read (with much gossip on the profession, so you can discover that econometricians are also human beings). Finally, econometric methods give answers to economic questions, and these answers must be understandable and look convincing to people who are experts on these questions and not econometricians.
So, introducing students to econometrics through applications is sensible. There is a limit to the approach followed in these notes, and students are expected to feel it more and more as they progress in this course. Examples and intuition quickly meet their limits, and to go further we must use logical and rigorous methods. So mathematics is unavoidable, and after having read these notes students must study a book of econometrics that includes the mathematical foundations of this field. However, doing that in a second stage of learning, after having gone through these notes, will be a much easier task than starting directly with the mathematics of econometrics.

There are many user-friendly econometric software packages. I advise you to use E-Views or Stata. Both are known and used in the whole world, and it is much wiser to learn and use software that is a world standard. Loosely speaking, E-Views is well adapted to macroeconomics (time series data) and Stata to microeconomics (individual data). However, the availability of such user-friendly software may encourage laziness and the absence of reflection. This is a pity, because these packages include powerful graphic capacities and plenty of descriptive statistics, which are extremely precious to look at the data and to learn much from them.

Alexander Pope wrote: "A little learning is a dangerous thing." Henri de Monfreid tells a nice story. When he was smuggling weapons on the Red Sea, he had a good friend, who was a policeman and the only French official on a small island near Djibouti. He said of his friend that he had the wisdom of men of the people who had not yet been spoiled by compulsory public education. Both comments are a bit arrogant and even reactionary, but they are basically true.

Many applied econometricians use sophisticated methods, which were developed by experts in the field, and they apply them to their problems and data without further reflection. This imitation process has become very easy with the existing software and the availability of many programs on the websites of E-Views and Stata. In general these methods were correct for the problem they were designed for, and their developers did not make mistakes in their applications. But the adaptation of these methods to problems they were not designed for can be awfully wrong. So, these applied econometricians make logical mistakes and draw silly conclusions. In my lifetime I have read many strange papers of applied econometrics with results that were meaningless, incomprehensible and unbelievable. These papers generally were in development economics and macroeconomics, but this can result from the fact that most of my readings are in these fields.

These applied econometricians had a superficial knowledge of theoretical econometrics and tried to substitute recipes for logic. So, they had little learning. However, they could have avoided their mistakes if they had not lost their common sense, the wisdom of the ignorant. These notes are quite insufficient to help you solve the first problem. However, they will give you advice to help you not to progressively lose your common sense as you become more and more learned. This advice can be summed up in two sentences.
First, do not forget that you are an economist and that your econometric results must be explainable in plain French (or English), and without cheating, to a non-econometrician economist. Secondly, look carefully at the data and do not apply a method based on assumptions that are contradicted by these data. In summary, econometrics must not make you lose your common sense.

Note: If you want to write sophisticated programs to solve computational problems, you had better use Gauss or Matlab.

Note: This insufficiency can be combined with straight dishonesty. A complicated scientific method can be manipulated to defend conclusions which agree with your ideology or your interest. This unfortunate situation especially happens in fields where there are hard political and ideological debates and conflicts.

Econometrics is a set of quantitative tools for analysing economic data. Economists need to use economic data for three reasons: 1) to decide between competing theories; 2) to predict the effect of policy changes; 3) to forecast what may happen in the future. Three examples: Have PCs increased the productivity of clerks and secretaries? How to evaluate and compare the efficiency of various policies against AIDS in Africa? How to forecast the demand for public transportation in a big city?

Economists deal with different kinds of data:

1. Time series data. For instance GDP data are collected every quarter. Macroeconomics and finance use such data. In macroeconomics frequencies are annual, quarterly or monthly. Frequencies are much shorter in finance. I will use the following mathematical notation for a variable or series: $Y_t$, with $t = 1, 2, \dots, T$. $T$ represents the number of observations.

2. Cross-sectional data. For instance in a labour survey you interview 1000 workers of the chemical industry on their wages, their labour conditions, etc. All these interviews take place at about the same date. Each question gives you as many answers as interviewed workers. Let us take for instance wages. I will use the following mathematical notation: $Y_i$, with $i = 1, 2, \dots, I$. $Y_i$ represents the wages of worker $i$, and $I$ is the number of surveyed workers. Cross-sectional data are mainly met in microeconomics (observations can bear on workers, households or firms). But macroeconomics can use such data when it compares different countries (for instance their GDP per head).

Time series data and cross-sectional data differ on a very important point. Time is oriented. The past comes before the future. You can use the past to forecast the future, but you cannot use the future to forecast the past. Of course, the past and the present depend on the expectations of the future by economic agents. But the expected future is based on the experienced past, and not on the true future, which is unknown. On the other hand, there is no natural way of ordering cross-sectional data. Because of this specificity the econometrics of time series data is a bit special (and in my opinion it is the most difficult part of econometrics).

3. Panel data. In the above example you can track the same workers for several years and interview them periodically over this period of time. For instance, you can interview the same people on their wages every year for five years. I will use the following mathematical notation for the wages of worker $i$ in year $t$: $Y_{it}$, with $i = 1, 2, \dots, I$ and $t = 1, 2, \dots, T$. There is a special field of econometrics to deal with this kind of data. In most cases, you will have a large number of individual units and a small number of time periods (5 for example). This is the case in my example, and in most of microeconomics.
Some macroeconomists work on a panel of countries or regional areas: for instance yearly data of 30 OECD countries over the 1970-2000 period. Their problem is a bit different: a smaller number of individuals (30 instead of 1000) and a larger number of periods (31 instead of 5). It is a controversial question whether the traditional econometrics of panel data is appropriate to the kind of problems macroeconomists face. The traditional econometrics of panels can be a bit tricky and cumbersome, but it is quite consistent with intuition and common sense, and the mathematics it uses is elementary.

4. Quantitative data and qualitative data. GDP is a quantitative variable: it takes real values, for instance 900 thousand million dollars. But some other data can take only two values, which in general are not numbers. Examples of such data are gender (male or female), whether a worker kept or lost his job in a given year, and whether a household owns or does not own a car. A difference between these two kinds of data is that quantitative data have a natural order: having a large GDP is better than having a small GDP. But the qualitative data above have no natural order.

A more complicated case is when you consider qualitative data which can take more than two values. For instance a household can own no car, 1 car, or more than 1 car. In this case a natural order appears among the three values (but I could build examples where such an order would not appear, like spending your vacation in the countryside, in the mountains or at the seaside). Sometimes a qualitative order appears, but you cannot make any quantitative comparison between the possible values taken by the variable. For instance, you can survey African newspapers to discover whether, after an IMF intervention, you will have: nothing, a big strike, big riots, or a revolution. A revolution is worse than a riot, but you cannot tell if it is three times or five times worse. On the other hand you can tell that French people are 8 times wealthier than Tunisian people. However, sometimes quantitative comparisons can be made for qualitative data, for instance if they represent the number of patents applied for and received by firms.

The econometrics of qualitative data, also called limited dependent variables, is a popular field of econometrics. The basics are simple to understand and to apply. But this simplicity disappears quickly when you try to go a little further than the most elementary cases.

When you analyse data, you quickly discover that it is sometimes better to transform them. For instance, if you want to compare the quality of life between several countries, you should divide your data by the populations of the associated countries. Thus, you will compare the numbers of medical doctors per 1000 people.

A useful, but tricky, transformation is to go from a variable in level (GDP for instance) to the growth rate of this variable. The growth rate of variable $Y$ will be denoted: $\dot{Y}_t = (Y_t - Y_{t-1}) / Y_{t-1}$. Sometimes, economists prefer the approximation: $\dot{Y}_t = \ln(Y_t) - \ln(Y_{t-1})$. You can multiply these formulae by 100 if you like having your growth rate in percentage points. You can notice that GDP and the growth rate of GDP do not have the same units. GDP is in dollars or euros. Its growth rate is a pure number, for instance 0.03 or 3%. Moreover, GDP has a trend, but its growth rate has no trend. You must not forget these differences when you write an economic equation. For instance it is sensible (and Keynesian) to write that consumption increases with income. But it is queer to write that the growth rate of consumption increases with income (and Keynes never claimed such a thing).

A last transformation is to substitute a variable in level by its natural logarithm. This has two advantages.
First, the values taken by the log are much smaller and vary over a much smaller range. Secondly, when you draw a graph of the evolution of the variable over time, if it grows at a constant rate, you get an exponential for the variable in level but a straight line for the variable in logarithm.

References

Marno Verbeek: A Guide to Modern Econometrics, 2nd edition, John Wiley, 2004. This is the book I advise students to buy and to read as soon as they have been through my course. In 400 pages it covers the whole field of econometrics at the introductory and medium-advanced levels (so the book can be used as a tutorial and a reference). It mixes econometric theory and applied econometrics, with plenty of examples on real data and interesting problems. The data can be downloaded from the website of the author. The theoretical elements are well explained, without excessive use of abstract mathematics, but with much precision and without mistakes. A difficulty with basic econometrics is that many elementary results are unimportant for applications and should be skipped in a first course. However, sophisticated and recent results are sometimes essential for applications and must be introduced in a first course, but in a simplified way. The author of this book succeeds well in doing that. Some questions which I think important for practitioners, robust econometrics and the foundations of the concepts of exogeneity, are absent from this book. However, other concepts, which are also very important for practitioners, do not appear in my notes, but are well developed in the book. The author takes strong positions on the various methods he presents, and gives advice on the mistakes many practitioners make when they apply these methods. The first half of the book is about general econometrics. The second half presents the econometrics of specific fields (limited dependent variables, time series, and panel data). So, I advise students to buy this book (orders through amazon.co.uk are executed promptly; do not try cheaper but unreliable online booksellers).

Peter Kennedy: A Guide to Econometrics, 4th edition, The MIT Press, 1998. This book is for practitioners and is centred on the mistakes to avoid when you do applied econometrics. Its chapters are divided into three parts. The first, written in large letters, presents very elementary things. The second, written in medium-sized letters, is more advanced. The third, written in small letters, is very advanced. There are no mathematical developments, but there are many theoretical results and some formulae, and mathematics is never very far in the background. Thus, this book is more advanced than my notes (and much more rigorous). There have been (at least) four editions, each much richer than the previous one. Thus, you can see that this book met with tremendous success in the whole world.

Gary Koop: Analysis of Economic Data, John Wiley and Sons, Chichester, 2000. This is a book written for students in business, who hate mathematics. Many ideas of my notes were taken from this volume.

Chandan Mukherjee, Howard White and Marc Wuyts: Econometrics and Data Analysis for Developing Countries, Routledge, London and New York, 1998. The first version of my course was based on this book. The parts of my notes on robust regressions are mostly taken from it.

Ernst R. Berndt: The Practice of Econometrics, Addison-Wesley, Reading, 1991. Each chapter presents the history of an economic problem (the capital asset pricing model, hedonic prices, etc.) and of the solutions applied economists gave to this problem. Each chapter ends with problems using the data used by these economists and asking to reproduce their results or variants of their results. Moreover, each chapter deals with a specific econometric problem (bivariate analysis, multivariate analysis, etc.). The book is extremely pleasant to read and very lively. One or two chapters (those connected to time series econometrics and macroeconomics) have become a little old-fashioned, but the others are still fully current.

The tutorial manual of E-Views is an excellent book of applied econometrics. You will find it in the help of the software for versions 4 and 5.

You will find an excellent course teaching how to use Stata on South African survey data on the website http://saproject.psc.isr.umich.edu/.

Finally, if you want to go further than these notes, I advise you to attend one or several courses in theoretical econometrics. In this field self-learning is difficult, even with the excellent books I quoted above.
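The growth-rate and logarithm transformations described in the introduction can be checked numerically. The following is a minimal sketch, assuming a made-up series that grows at exactly 3% per period; the function names and the data are illustrative and do not come from the course files.

```python
import math

def growth_rates(levels):
    """Exact growth rates (Y_t - Y_{t-1}) / Y_{t-1}."""
    return [(cur - prev) / prev for prev, cur in zip(levels, levels[1:])]

def log_growth_rates(levels):
    """Approximate growth rates ln(Y_t) - ln(Y_{t-1})."""
    return [math.log(cur) - math.log(prev) for prev, cur in zip(levels, levels[1:])]

# Hypothetical GDP series growing at a constant 3% per period.
gdp = [100.0 * 1.03 ** t for t in range(6)]

exact = growth_rates(gdp)        # each value is close to 0.03
approx = log_growth_rates(gdp)   # each value is close to ln(1.03)

# In logarithms the series is a straight line: its increments are constant.
log_levels = [math.log(y) for y in gdp]
log_steps = [b - a for a, b in zip(log_levels, log_levels[1:])]

for g, a in zip(exact, approx):
    print(f"exact: {100 * g:.3f}%   log approximation: {100 * a:.3f}%")
```

The log difference slightly understates the exact rate, and the gap widens as growth rates get larger, so the approximation is best kept for small rates.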

CHAPTER 1. DESCRIPTIVE STATISTICS

Graphs

When you have data, the first thing to do is to look at them by drawing well-chosen graphs. Excel and E-Views are complementary instruments to do that. We will go through a few examples.

First, Koop\exruk.xls gives monthly time series data from January 1947 through October 1996 of the UK pound / US dollar exchange rate. Draw the time series graph. Comment.

Second, Koop\gdppc.xls contains cross-sectional data on real GDP per capita in 1992 for 90 countries in US dollars using PPP exchange rates. Draw the histogram. Draw the kernel density. Notice the bimodal distribution of GDP per head.

Third, Koop\forest.xls contains data on deforestation and on population density for 70 tropical countries. Deforestation is the average annual forest loss over the period 1981-1990 expressed as a percentage of the total forested area. Population density is the number of people per thousand hectares. Draw the scatter diagram between these two variables. Notice the positive relationship, and the outliers.

Mean and other numerical summaries

Sometimes, you would like to sum up the distribution of a variable, for instance the above histogram, by a few numbers. There are two traditional summaries. The mean indicates the value around which all the values taken by the variable are equally distributed. The formula is $\bar{Y} = \left( \sum_{i=1}^{N} Y_i \right) / N$, where $N$ is the number of observations. The standard deviation is defined by the formula $s = \left[ \sum_{i=1}^{N} (Y_i - \bar{Y})^2 / (N - 1) \right]^{1/2}$. It is a measure of dispersion. Compute these two indicators on Koop\gdppc.xls.

Now, we should think a little harder about the previous concepts. A histogram represents the distribution of a sample of observations of a random variable. The histogram is a reflection of the true distribution function of the random variable. The first feature of this function is the level around which it is located, for instance 300 dollars or 30000 dollars. The concept of mean is a choice (among others) of a measure of this location. Sometimes this choice is good. Sometimes it is bad. I will later introduce the equation given by a linear regression. Such an equation will determine the mean (or expected value) of the explained or dependent variable, conditional on the knowledge of the explanatory variables. Thus, the concept of mean is central to econometrics, and we must understand its limits.

There exist three measures of the location of the distribution of a variable: the mode, which is the value most often observed; the median, which is the value such that the larger values are as many as the smaller values (the centre of probability); and the mean (the centre of gravity).
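The summaries defined above (mean, standard deviation with the $N - 1$ denominator, median and mode) can be computed with Python's standard `statistics` module. This is only a sketch on made-up wage figures; the data are an assumption for illustration and are not taken from the Koop files.

```python
import math
import statistics

# Hypothetical monthly wages of seven workers (illustrative values only).
wages = [900, 1000, 1000, 1100, 1200, 1500, 4000]

mean = statistics.mean(wages)        # centre of gravity
median = statistics.median(wages)    # centre of probability
mode = statistics.mode(wages)        # value most often observed
stdev = statistics.stdev(wages)      # uses the N - 1 denominator

# The same standard deviation written out from the formula in the text.
n = len(wages)
stdev_by_hand = math.sqrt(sum((y - mean) ** 2 for y in wages) / (n - 1))

print(f"mean={mean:.1f} median={median} mode={mode} sd={stdev:.1f}")
```

On this sample the three location measures disagree, because one large wage pulls the mean upwards while leaving the median and the mode untouched.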

Some distributions are bimodal (for instance GDP per capita in the above example). In this case the mean is a bad summary. Very often a bimodal distribution suggests that the sample includes two kinds of individuals which should be considered separately, at least at the beginning of the analysis. For instance you can have such a bimodal distribution for the wages of a sample of workers. This may result from the fact that women are uniformly less paid than men. Some distributions are much skewed to the right, for instance the payment of overtime hours to a sample of workers over different weeks. Then, the mode is smaller than the median, which is smaller than the mean.

When the distribution has the shape of a bell (so is unimodal) and is symmetric, the mode, the median and the mean are equal. In this case, it is justified to compute the mean. Then, the arithmetic mean computed on a sample of observations is a BLUE (best linear unbiased) estimator of the expected value of the underlying distribution. Moreover, if the distribution is normal, the arithmetic mean is the maximum likelihood estimator of the expected value of the underlying distribution. As the median and the mean are equal, we could think about estimating their common value, not by the arithmetic mean of the sample, but by its empirical median. This would be a bad idea. In the case of a normal distribution and for a large sample, the standard deviation of the median estimator is about 1.25 times the standard deviation of the arithmetic mean estimator. This means that the empirical median is a less precise estimator than the empirical mean. But for skewed distributions for which the mean has little meaning (for instance for wages), the empirical median deserves to be computed and looked at.

The concept of robust summary

I will consider the GDP per capita of a sample of sub-Saharan African countries. The first sample includes 7 countries and concludes that in 1990 GDP per capita had a mean of $354, a median of $370 and a standard deviation of $196. The mean appears as an interesting summary of this distribution. Now, in this sample, I will substitute Botswana for Lesotho. Then, the mean becomes equal to $570, the median remains equal to $370 and the standard deviation becomes equal to $673.
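The effect of swapping a single observation can be reproduced on made-up numbers. The sketch below is only an illustration: the seven values are hypothetical, chosen so that the mean and median come out close to the figures quoted above, and they are not the actual country data.

```python
import statistics

# Hypothetical GDP per capita (dollars) for seven countries; NOT the real data.
first_sample = [150, 250, 330, 370, 400, 450, 530]

# Replace the last country by a much richer one (again a made-up value).
second_sample = first_sample[:-1] + [2040]

def summary(values):
    """Return (mean, median, standard deviation) of a sample."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.stdev(values),
    )

mean_1, median_1, sd_1 = summary(first_sample)
mean_2, median_2, sd_2 = summary(second_sample)

print(f"first sample : mean={mean_1:.0f} median={median_1} sd={sd_1:.0f}")
print(f"second sample: mean={mean_2:.0f} median={median_2} sd={sd_2:.0f}")
```

Only one value out of seven changed, yet the mean and the standard deviation move a great deal while the median stays where it was.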
By changing only one individual in the sample, I did not change the median, but I changed the mean a lot. So, the mean is not a robust indicator of GDP per capita in African countries. But the median is a robust indicator. There is an economic and a political dimension in this discussion. If we fix at 500 dollars the level under which a country is considered to be poor, the mean indicates that African countries are above this poverty level. The median gives the opposite conclusion.

The problem with the mean is that it is the value $m$ that minimises the sum of squared deviations $\sum_i (Y_i - m)^2$. Because deviations are squared, an outlier has an excessive weight in its determination. The median, by contrast, minimises the sum of absolute deviations.

I consider the distribution of a random variable with mean $\mu$ and variance (the square of the standard deviation) $\sigma^2$. These indices are estimated by the empirical mean and variance $\bar{Y}$ and $s^2$. However, besides $\mu$ and $\sigma^2$ there exist two other important indices summarising the distribution of the random variable, which are computed by using the expected value operator, denoted $E$:

The skewness is: $\alpha_3 = E(Y - \mu)^3 / \sigma^3$

The kurtosis is: $\alpha_4 = E(Y - \mu)^4 / \sigma^4$

Skewness is zero for a symmetric distribution (hence for a normal distribution). Kurtosis is equal to 3 for a normal distribution. When the kurtosis is larger than 3 we will say that the distribution has fat tails. If you remember the formula of the normal distribution, you can notice that its density decreases like the inverse of the exponential of the squared distance when we go further and further from the mean. This means that this decrease is very fast and that, for a sample of reasonable size, the probability of having one observation or more farther than 4 standard deviations from the mean is practically zero.

Variance, skewness and kurtosis are arithmetic averages, that is, they take into account the value taken by the variable for each individual of the sample. Hence an outlier, that is an observation with a value far from the mean, will strongly affect these three indices. If I want to remove the perverse influence of outliers on the numerical indices which sum up the distribution of the investigated variable, I can compute other indices, which do not take into account the values taken by this variable for each observation, but only the rank of this value in the sample. To do that, I will rank the observations in increasing order. The median is the observation which divides the sample into two parts of equal size. The lower quartile $Q_L$ is the median of the part of the sample which is located below the median. The upper quartile $Q_U$ is the median of the part of the sample which is located above the median. These definitions must be refined to take into account that, for instance if the number of observations is even, there are two observations which are candidates to be the median. In this case we will take their mean as the median. The same problem occurs for quartiles.

The range is the difference between the highest value and the lowest value observed in the sample. It is very sensitive to outliers. However, the inter-quartile range IQR is the difference between the upper quartile and the lower quartile, $Q_U - Q_L$. It is not sensitive to outliers. An outlier is a point that is located very far outside the IQR.
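The moment-based indices and the rank-based indices defined above can be compared on a small sample. This is a sketch under stated assumptions: the data are made up, the quartiles follow the "median of each half" definition given in the text (software packages often use slightly different conventions), and the moments use the simple $1/n$ sample analogue of the expected value.

```python
import statistics

def quartiles(values):
    """Lower quartile, median and upper quartile (median of each half)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]              # observations below the median
    upper_half = ordered[half + n % 2:]      # observations above the median
    return (
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
    )

def moment_ratio(values, k):
    """Sample analogue of E(Y - mu)^k / sigma^k (k=3: skewness, k=4: kurtosis)."""
    n = len(values)
    mu = sum(values) / n
    sigma = (sum((y - mu) ** 2 for y in values) / n) ** 0.5
    return sum((y - mu) ** k for y in values) / n / sigma ** k

# Hypothetical samples: the second one contains a single outlier.
clean = [150, 250, 330, 370, 400, 450, 530]
with_outlier = [150, 250, 330, 370, 400, 450, 2040]

for name, sample in (("clean", clean), ("with outlier", with_outlier)):
    q_l, med, q_u = quartiles(sample)
    print(
        f"{name}: range={max(sample) - min(sample)} IQR={q_u - q_l} "
        f"skewness={moment_ratio(sample, 3):.2f} "
        f"kurtosis={moment_ratio(sample, 4):.2f}"
    )
```

The range, the skewness and the kurtosis all react strongly to the single outlier, whereas the quartiles and the IQR are identical in the two samples.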
For instance, $Y_O$ can be considered as an outlier if $Y_O < Q_L - 1.5\,IQR$ or $Y_O > Q_U + 1.5\,IQR$. We can decrease or increase the coefficient 1.5 to define near-outliers and far-outliers.

How can we test if a variable is distributed according to a normal law? The first thing to test is the symmetry of the distribution, that is, the absence of skewness. The most natural test is to check whether the difference between the mean and the median is large. The importance of this difference is evaluated relative to the IQR. This difference may be large, not because the distribution is asymmetric, but because of a small number of outliers. So it is interesting to measure whether the value of the median is near the mean of the two quartiles. If this is the case, the central values of the distribution are symmetrically distributed, and we can expect to have a symmetric distribution after having eliminated outliers. A robust index of skewness is the coefficient of Bowley: $b_s = (Q_U + Q_L - 2\,Md) / IQR$, where $Md$ is the median.

The second thing to test is kurtosis. For a normal distribution we have $\sigma = IQR / 1.35$. So we can evaluate, in a way robust to outliers, whether the tails of the distribution are fat or thin by computing the difference between the empirical standard deviation and the IQR divided by 1.35. When the tails of the distribution are fat, the empirical median becomes a better estimator of the mean than the empirical mean, which becomes very sensitive to outliers. We can build a 95% confidence interval of the mean as going from the observation of rank $\mathrm{integer}((n+1)/2 - \sqrt{n})$ to the observation of rank $\mathrm{integer}((n+1)/2 + \sqrt{n})$. The results of these formulae must be rounded to the nearest lower integer and to the nearest larger integer, respectively.

The Jarque and Bera test is more sophisticated but cannot discriminate between the presence of outliers and a true asymmetry or true fat tails. I must compute $Z_3 = a_3 \sqrt{n/6}$ and $Z_4 = (a_4 - 3)\sqrt{n/24}$, where $a_3$ and $a_4$ are the estimators of $\alpha_3$ and $\alpha_4$. If the variable is normally distributed, these two expressions follow standard normal distributions. Moreover, for big samples (1000), these two expressions become independent. Thus the sum of their squares follows a $\chi^2(2)$. This is the Jarque and Bera test.

To remove true skewness from a series we can transform it:
$Y^3$ reduces extreme negative skewness.
$Y^2$ reduces negative skewness.
$\log(Y)$ reduces positive skewness.
$-1/Y$ reduces extreme positive skewness.
If the transformation is well made, the mean and the median of the transformed series will approximately be equal. As this transformation preserves the order of the observations, the inverse transformation of the mean-median will give the median of the original series, but not its mean. In the same way, the application of the inverse transformation to the confidence interval will give the confidence interval of the median of the original series. This does not matter very much because, in case of skewness, the mean does not have great economic meaning for the original variable.
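The robust diagnostics of this section can be sketched in a few lines. Everything below is illustrative: the sample is invented, the variable names are mine, and the Jarque and Bera P-value uses the fact that a $\chi^2(2)$ variable exceeds $x$ with probability $\exp(-x/2)$. With only ten observations the asymptotic test is of course not reliable; the small sample is there to keep the arithmetic readable.

```python
import math
import statistics

data = sorted([2, 4, 4, 5, 6, 7, 8, 9, 10, 50])   # hypothetical sample
n = len(data)
half = n // 2
q_l = statistics.median(data[:half])
md = statistics.median(data)
q_u = statistics.median(data[-half:])
iqr = q_u - q_l

# Outlier fences with the coefficient 1.5.
outliers = [y for y in data if y < q_l - 1.5 * iqr or y > q_u + 1.5 * iqr]

# Bowley's robust skewness index.
b_s = (q_u + q_l - 2 * md) / iqr

# Robust versus empirical standard deviation.
robust_sigma = iqr / 1.35
empirical_sigma = statistics.pstdev(data)

# Ranks (1-based) bounding the approximate 95% interval around the median.
rank_low = max(1, math.floor((n + 1) / 2 - math.sqrt(n)))
rank_high = min(n, math.ceil((n + 1) / 2 + math.sqrt(n)))
interval = (data[rank_low - 1], data[rank_high - 1])


def jarque_bera(sample):
    """Return (Z3, Z4, JB, p_value), with JB = Z3**2 + Z4**2."""
    size = len(sample)
    mu = sum(sample) / size
    m2 = sum((y - mu) ** 2 for y in sample) / size
    a3 = sum((y - mu) ** 3 for y in sample) / size / m2 ** 1.5
    a4 = sum((y - mu) ** 4 for y in sample) / size / m2 ** 2
    z3 = a3 * math.sqrt(size / 6)
    z4 = (a4 - 3) * math.sqrt(size / 24)
    jb = z3 ** 2 + z4 ** 2
    # Chi-square with 2 degrees of freedom: P(X > x) = exp(-x / 2).
    return z3, z4, jb, math.exp(-jb / 2)


z3, z4, jb, p_value = jarque_bera(data)
print("outliers:", outliers, " Bowley:", b_s)
print(f"sigma: empirical {empirical_sigma:.2f}, robust {robust_sigma:.2f}")
print("interval around the median:", interval)
print(f"Jarque-Bera {jb:.1f}, P-value {p_value:.5f}")
```

Here Bowley's coefficient is zero (the central values are symmetric) while the Jarque and Bera test rejects normality: the rejection comes entirely from the single outlier, which the test cannot tell apart from true skewness or fat tails.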

CHAPTER 2. BIVARIATE ANALYSIS

Correlation

Let us come back to the file Mukherjee\forest.xls. The scatter diagram between deforestation and population density shows that these variables are related, but that the relationship is imperfect. If we draw a line in the middle of the scatter diagram, we can see that the points of the diagram are distributed around this line, but they are not on the line. The correlation between the two variables is a number between -1 and 1 which measures the intensity of their relationship. This intensity is very strong if the correlation is near -1 or 1. It is very weak if the correlation is near 0. More precisely, the square of the correlation measures the proportion of the cross-country variability in deforestation that matches up with the variance in population density. In our forest example, the correlation coefficient is equal to 0.66. As $0.66^2 = 0.44$, we can say that 44% of the cross-country variance in deforestation can be explained by the cross-country variance in population density.

I will spend the rest of the paragraph thinking about what "explaining" means. Sometimes I will use the verb "to cause" instead of "to explain". Both words will have the same meaning, which is the general meaning that people give to these words. Econometricians defined Granger-causality. This is quite a specific and technical concept, and I will not use it in this chapter.

My first example will be about an exercise which is given in Koop\hprice.xls. This file contains data relating to 546 houses in Windsor (Canada) in the summer of 1987. It contains the selling price (in Canadian dollars) along with many characteristics for each house. First, I can find a correlation of 0.54 between the price of a house and the size of its lot. Thus, we can think that the size of the lot causes the price of the house which is located on it. If I have a house and I buy some land beside it, its price will increase. Second, I find a correlation of 0.37 between the price of a house and the number of its bedrooms. Thus, I can think that a house with 4 bedrooms will have a higher price than a house with 3 bedrooms. Third, I find a correlation of 0.15 between the size of the lot and the number of bedrooms of a house. This number is surprisingly low.
I would have thought that big houses often go with large lots. This relationship exists, but it is pretty weak.

Now, let us think about the strong relationship between the price of a house and the size of its lot. One reason for this relationship is that a house with a big garden will have a higher price than a house without a garden. A second reason is that large lots are (weakly) connected with large numbers of bedrooms, and buyers are ready to pay for a large number of bedrooms. Thus, there is something spurious about the strong correlation of 0.54 found between price and size of the lot: if I buy some land beside my house without adding one more bedroom, its price might increase by less than expected. This is the essence of multivariate regressions, which will be considered in the next chapter. However, this is also the essence of the main difficulty which is met by applied econometricians.

I will develop this last idea, which may be called direct against indirect causality, on another example. We can find a strong correlation between holding a university degree and pay. However, does that mean that education increases productivity and earnings, as the theory of human capital assumes? The strong correlation could arise because people with a university degree are intelligent, and firms are ready to pay high wages to intelligent people. In microeconomics this is called the theory of screening, which is part of the theory of signalling. The theory of human capital considers that there is a direct causality from education to pay. The theory of screening considers that there is an indirect causality from intelligence to pay, which passes through education; education by itself would be useless. Econometricians developed plenty of tricky methods to determine which of these two theories is right. You can imagine that using intelligence tests to discriminate between these theories is controversial: such tests are sensitive to social or ethnic backgrounds and are not a wholly convincing measure of intelligence. However, you can build your econometrics, for instance, by comparing twins with different levels of education (one of them did not go to university because of a car accident).

The last example. If you take a sample of people, you will find a strong correlation between the number of cigarettes each person smokes per week and whether they have lung cancer. This result is normal, because smoking causes cancer. You will also find a strong correlation between the number of cigarettes smoked every week and the amount of alcohol drunk in a typical week. This result is also normal and may be related to social attitude: there exist people who do not care much about nutrition and who like spending their evenings in pubs. These people drink, smoke and eat much fat. Of course, you will also find a strong correlation between drinking alcohol and having lung cancer. However, this correlation is spurious: drinking does not cause lung cancer. It is only that people who drink a lot tend to smoke a lot, and smoking causes cancer.
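The smoking and drinking story can be mimicked with simulated data. The sketch below is entirely hypothetical: the three series are generated by the code, the coefficients 0.7 and 0.8 are arbitrary, and by construction drinking has no direct effect on the cancer risk. The second correlation is computed after removing from both series the part that moves linearly with smoking.

```python
import math
import random


def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def remove_linear_part(ys, xs):
    """What is left of ys once its best linear fit on xs is subtracted."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)          # fixed seed so that the run is repeatable
n = 5000
smoking = [rng.gauss(0, 1) for _ in range(n)]
# Assumed data-generating process: drinking goes with smoking, and the
# cancer risk depends on smoking only.
drinking = [0.7 * s + rng.gauss(0, 1) for s in smoking]
cancer_risk = [0.8 * s + rng.gauss(0, 1) for s in smoking]

r_raw = correlation(drinking, cancer_risk)
r_net = correlation(remove_linear_part(drinking, smoking),
                    remove_linear_part(cancer_risk, smoking))
print(f"drinking vs cancer risk: {r_raw:.2f}")
print(f"same, smoking held fixed: {r_net:.2f}")
```

The raw correlation is clearly positive although drinking causes nothing in this artificial world; it vanishes once smoking is accounted for.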
My conclusion is that correlation is a very helpful tool when you want to analyse a problem. However, correlation (and econometrics) is quite insufficient by itself. You still have to make a clever analysis of the problem, using common sense and a few well-thought tricks. Computers are not substitutes for human intelligence.

Compute the correlation matrix of koop\cormat.xls. The correlation between variables X and Y is given by:

$r = \dfrac{\sum_{i=1}^{N} (Y_i - \bar{Y})(X_i - \bar{X})}{\sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2 \sum_{i=1}^{N} (X_i - \bar{X})^2}}$

X and Y appear in a symmetric way in the formula: the correlation between X and Y is the same as the correlation between Y and X. This is another way to consider the ambiguous meaning of correlation: if it is high, does X cause Y or does Y cause X?

An introduction to simple regression

In the example with hprice.xls, we drew a scatter diagram of the price of a house relative to the size of the lot where it is located. We found a positive relationship between these two variables. Let us call X the size of the lot and Y the price of the house. This relationship can be written:

$Y = \alpha + \beta X$

This is the equation of the regression line of Y relative to X. X is the explanatory variable. Y is the explained or dependent variable. $\alpha$ and $\beta$ are the coefficients or the parameters of the regression line. You remember the line the computer drew in the middle of the scatter diagram. Now, this relationship is only approximately true: the observations for each house are distributed around this line, but they are not on the line. Thus, the true equation for house $i$ is:

$Y_i = \alpha + \beta X_i + \varepsilon_i$

$\varepsilon_i$ is called the error term. For some houses it is positive. For others it is negative. Sometimes it is large. Sometimes it is small. Thus, it represents the fact that the regression line is only an approximation of the truth. The error term sums up all the omitted variables: the number of rooms, the quality of the district, the ability of the seller to get a good price, etc.

Now, the econometrician does not know the true values of parameters $\alpha$ and $\beta$. He will use statistics to infer estimates of these values based on the data he has for 546 houses. Of course, these estimates will be a little wrong: they will differ from the true values $\alpha$ and $\beta$. We will denote them $\hat{\alpha}$ and $\hat{\beta}$. The true equation can be replaced by the estimated equation:

$Y_i = \hat{\alpha} + \hat{\beta} X_i + u_i$

There are two differences between the estimated and the true equations. First, the true values of the parameters are replaced by their estimated values. Second, the error term $\varepsilon_i$ is replaced by the residual $u_i$. The residual cumulates all the approximations in the true equation, which are included in the error term $\varepsilon_i$, plus the error resulting from the approximation of the true values of the parameters by their estimates.

How do econometricians estimate parameters? They want the estimated regression line to be well in the middle of the scatter diagram. Or, to be more precise, they want to minimise the values of the residuals of the equation. If I call $u_i$ the residual for house $i$, a way to measure the importance of the residuals is to compute the sum of the squared residuals: $SSR = \sum_{i=1}^{N} u_i^2$. The most popular way to compute the estimates $\hat{\alpha}$ and $\hat{\beta}$ is to look for the values which minimise the SSR.
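This method of estimation is called ordinary least squares (OLS). It is easy to compute the formulae of $\hat{\alpha}$ and $\hat{\beta}$. The result is:

$\hat{\beta} = \dfrac{\sum_{i=1}^{N} (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}, \qquad \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$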
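As a minimal sketch, the two OLS formulae can be applied directly to a handful of observations. The five lot sizes and prices below are invented for the illustration (they are not taken from hprice.xls), and the function names are mine. The last lines check numerically that moving either estimate away from its OLS value increases the SSR.

```python
def ols(xs, ys):
    """OLS estimates (alpha_hat, beta_hat) of Y = alpha + beta * X + error."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def ssr(alpha, beta, xs, ys):
    """Sum of squared residuals for a candidate line."""
    return sum((y - alpha - beta * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical lot sizes (square feet) and prices (C$).
lot = [3000, 4000, 5000, 6000, 7000]
price = [52000, 61000, 66000, 75000, 81000]

alpha_hat, beta_hat = ols(lot, price)
best = ssr(alpha_hat, beta_hat, lot, price)

# Any other line gives a larger SSR.
shifts = [(500, 0.0), (-500, 0.0), (0, 0.1), (0, -0.1)]
worse = [ssr(alpha_hat + da, beta_hat + db, lot, price) for da, db in shifts]

print(f"alpha_hat = {alpha_hat:.0f}, beta_hat = {beta_hat:.2f}, SSR = {best:.0f}")
print("SSR of the perturbed lines:", [round(w) for w in worse])
```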

OLS estimation, that is the minimisation of the SSR, is a very popular method in econometrics. However, it has a few disadvantages, which will be considered later on. In the example of hprice.xls, I found $\hat{\alpha} = 34136$ and $\hat{\beta} = 6.60$. $\hat{\alpha}$ has no simple interpretation. The result for $\hat{\beta}$ means that if you increase the size of your lot by 1 square foot, the price of your house will increase by C$ 6.60. However, you must remember two things. First, the equation is an approximation. Secondly, the relationship between size and price may be partly spurious and reflect, for instance, that large lots are often associated with a large number of rooms.

Let us consider house $i$. The estimated equation explains its price by $Y_i = \hat{\alpha} + \hat{\beta} X_i + u_i$. $Y_i$ is the true price of house $i$. If this price was on the estimated regression line it would be $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$. $\hat{Y}_i$ represents the fitted or predicted value of the true price of house $i$. Actually, this price can be decomposed between the fit and the residual of the equation: $Y_i = \hat{Y}_i + u_i$. E-Views computes the fit and the residual for each house.

The total sum of squares of the prices of houses is given by $TSS = \sum_{i=1}^{N} (Y_i - \bar{Y})^2$. Actually, TSS divided by N represents the variance of this price. The regression sum of squares is defined by $RSS = \sum_{i=1}^{N} (\hat{Y}_i - \bar{Y})^2$. Divided by N, it represents the variance of the fit of the price. We can prove that $TSS = RSS + SSR$. This means that the dispersion of prices is the sum of the dispersion of the fitted values of the equation and of the dispersion of the residuals. The precision of the equation, or the quality of its fit, is better if SSR is relatively small and RSS is relatively high. Thus, it is natural to measure the quality of the fit by the coefficient $R^2 = RSS / TSS$. This coefficient is between 0 and 1. In the bivariate case investigated in this chapter, $R^2$ is equal to the square of the correlation coefficient between the two variables.

Until now I have considered a linear relationship between Y and X: $Y = \alpha + \beta X$. There are many cases when Y and X are strongly related, but in a non-linear way. For instance $Y = \alpha + \beta X^2$. In this case, we substitute $X^2$ for X and we proceed exactly as before.
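The decomposition $TSS = RSS + SSR$ and the link between $R^2$ and the correlation coefficient can be checked numerically. The sketch below uses five made-up observations (not the hprice.xls data) and hypothetical variable names.

```python
import math

# Hypothetical lot sizes (square feet) and prices (C$).
lot = [3000, 4000, 5000, 6000, 7000]
price = [52000, 61000, 66000, 75000, 81000]

n = len(lot)
x_bar = sum(lot) / n
y_bar = sum(price) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(lot, price))
sxx = sum((x - x_bar) ** 2 for x in lot)

# OLS estimates, fitted values and residuals.
beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * x for x in lot]
residuals = [y - f for y, f in zip(price, fitted)]

tss = sum((y - y_bar) ** 2 for y in price)      # total sum of squares
rss = sum((f - y_bar) ** 2 for f in fitted)     # regression sum of squares
ssr = sum(u ** 2 for u in residuals)            # sum of squared residuals

r_squared = rss / tss
r = sxy / math.sqrt(sxx * tss)                  # correlation between X and Y

print(f"TSS = {tss:.0f}, RSS + SSR = {rss + ssr:.0f}")
print(f"R2 = {r_squared:.4f}, r squared = {r * r:.4f}")
```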
Thus, we just transform variables in an adequate way to get a linear regression model with the transformed variables. The most popular transformation is the natural logarithm transformation: $\ln(Y) = \alpha + \beta \ln(X)$. To understand its meaning, consider the relationship between households' consumption C and households' income Y. According to Keynes we should have $C = a + cY$. Keynes called c the marginal propensity to consume and considered it to be positive but smaller than 1. However, we can get a better fit with the regression equation $\ln(C) = b + d \ln(Y)$. d is the elasticity of consumption relative to income: if income increases by 1%, consumption will increase by d%. Econometricians generally find d to be near 1. This is a homogeneity constraint which must be satisfied if you do not want to find queer differences between countries of different sizes.

Most macroeconomic series have a geometric trend if they are not transformed. Their logarithm exhibits a linear trend (which is easier to analyse on a graph). The first difference of a series in logarithm is approximately the growth rate of the original series. Thus, macroeconomists love to work with series in logarithms and with log-linear functions. However, a few series should not be transformed in this way, for instance those which fluctuate around 0 (the logarithm of 0 is minus infinity). Examples of such series are interest rates, inflation rates and the rate of unemployment.

We can think a little bit more about non-linearity and common sense with the data of mukherjee\maputo. Each morning the authorities of the harbour of Maputo declare the number DEMD of docker positions they have to fill for the day. Of course, this number varies with the number of ships arriving at the harbour. The number of dockers hired for the day is RECD. Of course, this number cannot be larger than DEMD, and it depends on the number of dockers who are ready to work that day. We have data on DEMD and RECD for 400 successive days. First, draw the histograms of the two series. Then, regress the first series on the second and a constant term. The result looks econometrically good. Is it not economically silly? Draw the scatter diagram of both variables and think a little bit. Of course $RECD \le DEMD$, and the difference between both variables increases with DEMD.

Statistical aspects of regressions

I will consider again the relationship between the price of a house and the size of the lot where it is located. The true relationship is:

$Y_i = \alpha + \beta X_i + \varepsilon_i$

The values of parameters $\alpha$ and $\beta$ are unknown to the econometrician. However, the econometrician has observations on a sample of 546 houses, and he can estimate by OLS the equation:

$Y_i = \hat{\alpha} + \hat{\beta} X_i + u_i$

$\hat{\alpha}$ and $\hat{\beta}$ are called the estimates of $\alpha$ and $\beta$. They depend on the precise set of 546 observed houses. With another sample of 546 houses I would have got different estimates. As the sample of 546 houses is random, you could consider that the estimates are also random.
However, most estimates computed on different samples of houses will be near the true values of the parameters. It is a good idea to consider an estimate as the realisation of a random variable, called an estimator. An estimator is a function which links a random sample of houses to two values for parameters $\alpha$ and $\beta$. An estimate is the value taken by the estimator for a specific sample of houses.

In the rest of this paragraph I will only consider parameter $\beta$. It is the most interesting parameter of the equation because it gives the sensitivity of the price of a house to the size of its lot. However, the main reason for this choice is that I do not want to complicate my notations and explanations by dealing with two parameters instead of one.

When you have computed an estimate, you know that it will probably differ from the true value of the parameter. Thus, it is clever to choose a confidence level, for instance 95%. Then, statistics allows us to compute an interval centred on $\hat{\beta}$ such that the true value $\beta$ has a probability of 95% of belonging to this interval. This interval is called a confidence interval. The wider it is, the more imprecise the estimation is. For instance, newspapers publish such intervals with their political polls: the conservative party will get between 32% and 36% of the votes in the next election.

Koop makes Monte Carlo simulations (pp. 58 ff.). He takes the true equation and fixes the values of the parameters to $\alpha = 0$, $\beta = 1$. Thus, the true equation he will simulate is $Y = X + \varepsilon$. Then he chooses a sample of values for X, he makes random drawings for the error term $\varepsilon$ (in E-Views you have a command which generates random numbers), he computes Y and he draws the scatter diagram of X and Y. An econometrician would observe the scatter diagram, but would not know the true values of the parameters. Instead he would try to infer these values, that is to compute estimates, from the scatter diagram. Koop draws 4 scatter diagrams and we can notice on them:
1. More observations will increase the accuracy of the estimation.
2. Smaller errors (i.e. a smaller variance of $\varepsilon$) will increase the accuracy of the estimation.
3. A larger spread of values of X (that is a larger variance of the explanatory variable) will increase the accuracy of the estimation. This is normal: if all the lots had a size between 5000 square feet and 6000 square feet, estimating the effect of size on price would be difficult.
I advise you to look at the 4 scatter diagrams drawn by Koop.

Econometric theory shows that the confidence interval of $\beta$ is $(\hat{\beta} - t_\alpha s_b,\ \hat{\beta} + t_\alpha s_b)$. $s_b$ is the standard deviation of the estimator of $\beta$. We saw that this estimator is a random variable (it depends on randomly chosen houses), and its standard deviation measures the accuracy of the estimation. Econometric theory gives the formula:

$s_b^2 = \dfrac{SSR}{(N-2) \sum_{i=1}^{N} (X_i - \bar{X})^2}$

To interpret it I will write it a bit differently:

$s_b^2 = \dfrac{SSR / N}{(N-2) \left[ \left( \sum_{i=1}^{N} (X_i - \bar{X})^2 \right) / N \right]}$

$s_b$ decreases, that is the accuracy of the estimation increases, when: 1) $N - 2$, that is the number of observations, increases; 2) $SSR / N$, that is the variability of the error terms, decreases; 3) $\left[ \left( \sum_{i=1}^{N} (X_i - \bar{X})^2 \right) / N \right]$, that is the variability of the explanatory variable, increases.
Thus, this formula is consistent with the four scatter diagrams drawn by Koop. $\alpha$ is the confidence level, for instance 95%. If the error term is normally distributed, $t_\alpha$ is a value which can be found in a Student statistical table. It depends on $\alpha$, of course, but also on the number of observations N. However, except for very low values of N, $t_\alpha$ does not change much with N, and its value can be found in a normal distribution statistical table. Nowadays, nobody looks at statistical tables: the computer looks at them instead. However, 20 years ago students had to learn how to use these tables, which was a bit cumbersome.
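The three effects on $s_b$ can be reproduced with a small Monte Carlo experiment in the spirit of the one described above. This is only a sketch under my own assumptions: the sample sizes, error standard deviations and ranges of X are arbitrary choices, not Koop's, and $t_\alpha$ is taken from the normal distribution, which the text allows when N is not very low.

```python
import math
import random
from statistics import NormalDist

ALPHA, BETA = 0.0, 1.0     # assumed "true" parameters of the simulated equation


def slope_and_se(xs, ys):
    """OLS slope and its standard deviation s_b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    ssr = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    return beta_hat, math.sqrt(ssr / ((n - 2) * sxx))


def simulate(n, error_sd, x_low, x_high, rng):
    """Draw one artificial sample and estimate the slope on it."""
    xs = [rng.uniform(x_low, x_high) for _ in range(n)]
    ys = [ALPHA + BETA * x + rng.gauss(0.0, error_sd) for x in xs]
    return slope_and_se(xs, ys)


rng = random.Random(1)                           # fixed seed, repeatable run
base = simulate(400, 1.0, 0.0, 10.0, rng)        # reference case
few_obs = simulate(20, 1.0, 0.0, 10.0, rng)      # fewer observations
noisy = simulate(400, 5.0, 0.0, 10.0, rng)       # larger errors
narrow = simulate(400, 1.0, 4.5, 5.5, rng)       # smaller spread of X

t_alpha = NormalDist().inv_cdf(0.975)            # about 1.96 for 95%
beta_hat, s_b = base
interval = (beta_hat - t_alpha * s_b, beta_hat + t_alpha * s_b)

for name, (b, se) in [("base", base), ("few obs", few_obs),
                      ("noisy", noisy), ("narrow X", narrow)]:
    print(f"{name:9s} beta_hat = {b:.3f}  s_b = {se:.3f}")
print(f"95% interval in the base case: ({interval[0]:.3f}, {interval[1]:.3f})")
```

In each of the three degraded cases $s_b$ is larger than in the reference case, so the confidence interval is wider.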

What happens if the error term is not normally distributed, that is if its distribution includes fat tails or if there are outliers? Econometric theory proves that the previous results are still true if the number of observations is large. However, "large" has an ambiguous meaning: does large mean 100 observations, 500 or 5000? The answer to this question will make a big difference for the practitioner. Fat tails are bad [3], but outliers are very bad. This explains why I spent some time on the concept of robustness in chapter 1. I will come back to this concept a bit later.

In this paragraph, I have dealt until now with the estimation of parameters. Now, I will consider testing. Does the price of a house really depend on the size of the lot where it is located? If you find the question silly, is deforestation really sensitive to population density? After all, it could be sensitive only to the greed of foreign capitalists, and the locals could cherish their forests. Or, in a more general way, is Y sensitive to movements of X, that is, does $\beta$ differ from 0 or not?

I will call the hypothesis $\beta = 0$ the null hypothesis. If it is true, the explanatory variable has no effect on the explained variable. $\beta \neq 0$ is the alternative hypothesis. If it is true, changes in the explanatory variable affect the dependent variable. I will test the null hypothesis against the alternative hypothesis. Statisticians are pessimistic people. Either the evidence given by the observed data contradicts the null hypothesis, and the null hypothesis is rejected. Or the evidence given by the observed data does not contradict the null hypothesis, and the null hypothesis is not rejected. This conclusion is different from "the null hypothesis is accepted". The null hypothesis could be completely wrong, but the observations could be insufficiently informative to discover that.

To process a test you need to compute an appropriate test statistic. For our problem, this statistic is called a t-statistic, or t-ratio, and is defined as $t = \hat{\beta} / s_b$. If $t$ is small in absolute value, I will not reject the null hypothesis; if it is large, I will reject the null hypothesis. Now, what do large and small mean?
Econometric theory proves that if the null hypothesis is true (and if the error term is normally distributed or if the number of observations is large), $t$ is distributed as a Student distribution (or as a normal distribution if the number of observations is large). Student and normal distributions are implemented in econometric software. Thus, assume that the value of the t-statistic is 2.36. If the null hypothesis is true, that is if $\beta = 0$, the computer can tell us that the probability of having a t-statistic equal to or larger than 2.36 in absolute value is about 2% (to be fairly honest I did not check this last number because I have no statistical table with me). This probability is called the P-value of the test. It is a low probability. If the null hypothesis was true, it would have been very unlikely to get a t-statistic as high as 2.36. Thus, I will reject the null hypothesis. A rule of thumb is to reject the null hypothesis when the P-value is smaller than 5%, and not to reject it when it is larger than 5%. 5% is called the significance level of the test. For some problems you could prefer choosing another significance level, for instance 10% or 1%.

[3] Actually, if the tails of the distribution of the error term are a little too fat, its standard deviation, and even its mean, are not defined, and usual econometrics becomes invalid.

We can be a little more precise. We saw above that a larger number of observations will increase the accuracy of the estimation. Thus, if the null hypothesis is a little wrong, that is if the explanatory variable has a weak effect on the explained variable, the P-value of the test will be high if the size of the sample is small, but very low if this size is large. Thus, it will be wise to take a significance level which is high for a small sample size (10%) and low for a large sample size (1%) [4].

If the number of observations is large enough, a t-statistic of 1.96 is associated with a P-value of 5%. Then, you will reject the null hypothesis $\beta = 0$ if the absolute value of the t-statistic is larger than 1.96. This critical value increases when the number of observations decreases. For instance, with 20 observations, 2.09 must be substituted for 1.96. The difference between these two numbers is tiny. However, you must remember that the Student test rests upon the assumption that the error term follows a normal law (or otherwise that the number of observations is large).

If you are a bit imaginative, you have already noticed that you would have the same test if you computed the confidence interval associated with a confidence level of 95% (which is 100% - 5%), and checked whether 0 belonged to this interval. If 0 belongs to this interval, you will not reject the null hypothesis. Otherwise you will reject it. If you are a little more than a bit imaginative, you will have noticed that testing $\beta = 0$ is testing whether the explanatory variable has an influence on the dependent variable, i.e. whether the $R^2$ of the regression is equal to 0 or not.

You might like to test not the null hypothesis $\beta = 0$, but the null hypothesis $\beta = c$, where c is a nonzero number which you selected. The test of this new null hypothesis proceeds as before, except that you must use the new statistic $t = (\hat{\beta} - c) / s_b$ instead of $t = \hat{\beta} / s_b$. This is a form of the Frisch-Waugh theorem.
This theorem proves that if you estimate by OLS the equation $Y_i = \alpha + \beta X_i + \varepsilon_i$ and the equation $Y_i - cX_i = \alpha + (\beta - c) X_i + \varepsilon_i$, and if you denote by $\hat{\alpha}$ and $\hat{\beta}$ the estimates of the first equation, the estimates of the second equation will be $\hat{\alpha}$ and $\hat{\beta} - c$. Thus, testing whether the coefficient of the explanatory variable is equal to c in the first equation is equivalent to testing whether the coefficient of the explanatory variable is equal to 0 in the second equation. Then, we can apply the test developed before to the second equation, and we get the above formula for the test statistic.

[4] To make this sentence clearer we would need to introduce the concepts of type II error and of power of a test.
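The equivalence between the two equations can be verified on a toy example. The sketch below uses eight invented observations and hypothetical function names; the P-value is computed with the normal distribution, which is only the large-sample approximation mentioned in the text (with so few observations a Student table should be used in practice).

```python
import math
from statistics import NormalDist


def slope_and_se(xs, ys):
    """OLS slope and its standard deviation s_b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    ssr = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    return beta_hat, math.sqrt(ssr / ((n - 2) * sxx))


def t_test(beta_hat, s_b, c=0.0):
    """t-statistic for the null hypothesis beta = c and two-sided P-value."""
    t = (beta_hat - c) / s_b
    p_value = 2 * (1 - NormalDist().cdf(abs(t)))   # normal approximation
    return t, p_value


# Hypothetical data, roughly on the line Y = 1 + X.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.9]
c = 1.0

b, s_b = slope_and_se(xs, ys)
t_zero, p_zero = t_test(b, s_b)          # null hypothesis: beta = 0
t_c, p_c = t_test(b, s_b, c)             # null hypothesis: beta = c

# Second equation: regress Y - cX on X and test a zero slope.
ys_shifted = [y - c * x for x, y in zip(xs, ys)]
b_shifted, s_b_shifted = slope_and_se(xs, ys_shifted)
t_shifted, p_shifted = t_test(b_shifted, s_b_shifted)

print(f"beta_hat = {b:.4f}, slope of the second equation = {b_shifted:.4f}")
print(f"t for beta = 0: {t_zero:.1f}; t for beta = {c}: {t_c:.3f} (P = {p_c:.2f})")
print(f"t for a zero slope in the second equation: {t_shifted:.3f}")
```

On these data the hypothesis $\beta = 0$ is rejected while $\beta = 1$ is not, and the two ways of testing $\beta = c$ give the same statistic.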

Robustness of regressions

I will consider the true model:

$Y_i = \alpha + \beta X_i + \varepsilon_i$

and its estimation by OLS:

$Y_i = \hat{\alpha} + \hat{\beta} X_i + u_i$

For observation $i$, the residual $u_i$ is an estimate of the error term $\varepsilon_i$. A basic assumption for OLS is that the standard deviation of the error term is the same for all observations. Its estimate is $s = \sqrt{SSR / (N-2)}$. However, the standard deviation of residuals changes with observations. For observation $i$ it is $se(u_i) = s\sqrt{1 - h_i}$, with:

$h_i = \dfrac{1}{N} + \dfrac{(X_i - \bar{X})^2}{\sum_{j=1}^{N} (X_j - \bar{X})^2}$

$h_i$ is called the hat statistic. The more distant $X_i$ is from its mean, the higher $h_i$ is.

In a regression, an outlier is an observation with a large residual, compared to the residuals of most other observations. An observation with high leverage is an observation with an explanatory variable which takes a value very different from its mean, that is an observation with a high hat statistic. An influential observation is such that if you remove it, the estimate of $\beta$ will change a lot. These concepts are related to one another, but they are different. Loosely speaking, an observation with high leverage has the potential of being influential. If this observation is an outlier, this potential is realised and the observation is influential. So, to be influential an observation must simultaneously have high leverage and be an outlier. When an observation is influential, the results of the estimation of the regression depend to a strong extent on this observation. So, they become fragile: you would like the results of your estimation to be almost insensitive to the deletion of any arbitrary set of a small number of observations.

To identify an outlier we must compute the studentized residuals. To do that, we will divide each residual by its standard deviation. However, as the estimate $s$ of the standard deviation of the error term is sensitive to outliers, in the formula giving the standard deviation of residual $u_i$ we will use an estimator of the standard deviation of the error term which does not use this residual, let it be $s_{(i)}$ instead of $s$. Then, we get the expression:

$t_i = \dfrac{u_i}{s_{(i)} \sqrt{1 - h_i}}$

There exists a table for this statistic, so we can test whether $u_i$ significantly differs from 0. The critical values of this table are higher than for a Student table. It is easy to understand why. Let us assume that I am interested in a different problem: is observation $i$ an aberration in the sample?
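Before going on, the hat statistic and the studentized residual defined above can be sketched on a small invented sample in which the last observation has a very distant X and lies off the line followed by the others. Two points are assumptions of this sketch and not part of the text: the variable names, and the shortcut used for $s_{(i)}$, which relies on the standard identity $(N-3)\,s_{(i)}^2 = SSR - u_i^2/(1-h_i)$ for a regression with a constant and one explanatory variable, instead of re-estimating the equation without observation $i$.

```python
import math


def diagnostics(xs, ys):
    """List of (h_i, u_i, t_i): hat statistic, residual, studentized residual."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    resid = [y - alpha_hat - beta_hat * x for x, y in zip(xs, ys)]
    ssr = sum(u * u for u in resid)
    rows = []
    for x, u in zip(xs, resid):
        h = 1 / n + (x - x_bar) ** 2 / sxx
        # Standard deviation of the error term estimated without observation i.
        s_deleted = math.sqrt((ssr - u * u / (1 - h)) / (n - 3))
        rows.append((h, u, u / (s_deleted * math.sqrt(1 - h))))
    return rows


# Hypothetical data: seven points close to Y = X, one distant point below it.
xs = [1, 2, 3, 4, 5, 6, 7, 20]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 12.0]

rows = diagnostics(xs, ys)
hats = [h for h, _, _ in rows]
raw = [u for _, u, _ in rows]
student = [t for _, _, t in rows]

for x, (h, u, t) in zip(xs, rows):
    print(f"X = {x:2d}  h = {h:.3f}  residual = {u:6.3f}  studentized = {t:7.2f}")
```

The distant point pulls the regression line towards itself, so its raw residual is not the largest of the sample; only the studentized residual reveals that it is out of line with the other observations.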
Then, I will estimate the equation after having added a new explanatory variable, which is a dummy variable with a value equal to 0 except for observation $i$, where it is equal to 1. Then, I will compute the Student t-statistic of the coefficient of the dummy variable, and I can notice that it is exactly equal to the studentized residual for this observation: $t_i$. I can compare this statistic to the critical value of a Student table, which is of the order of 1.96. The null hypothesis is that observation $i$ is not an outlier. However, in this paragraph the