Efficiency Analysis Working Papers
WP-EA-11

Key Parameters and Efficiency of Mexican Production
Are there still Differences between the North and the South?
An Application of Nested and Stochastic Frontier Panel Data Models

Frauke Braun
Deutsches Institut für Wirtschaftsforschung, Berlin
and
Astrid Cullmann
Deutsches Institut für Wirtschaftsforschung, Berlin

2007

Dresden University of Technology
Chair of Energy Economics and Public Sector Management

ABSTRACT

This study explores the prevalence and nature of the North-South divide in Mexican manufacturing production across sub-national regions. We use a unique panel data set of aggregate municipality-level data from the manufacturing sector. First, we apply panel methods to estimate regional production functions and to analyze production characteristics and scale economies. Subsequently, we use stochastic frontier analysis methods to test for productivity and efficiency differences in manufacturing throughout the country. Our results suggest that the economic structure and productivity of southern Mexico are still considerably different from those of the North of the country. Remarkably, efficiency parameters from both traditional panel and stochastic frontier estimations vary strongly within states, indicating that islands of excellence prevail in otherwise highly inefficient and lagging states.

Keywords: Mexico, Manufacturing, Efficiency Analysis, Stochastic Frontier Analysis, Panel Data Models

JEL classifications: C3, C14, R30

We have benefited from discussions with Peter Haan, Somik Lall, Marisela Montulio-Munoz and Viktor Steiner. Findings, interpretation and conclusions expressed herein do not necessarily reflect the views of the Board of Executive Directors of the World Bank or the governments they represent.

Corresponding author: Frauke Braun, DIW Berlin, Mohrenstrasse 58, 10117 Berlin, fbraun@diw.de.

1) Introduction

Mexico represents an interesting and especially relevant case for spatial analysis due to its significant and persistent regional disparity. The most pronounced dividing lines run between urban and rural areas and, even more so, between the northern and southern areas. North-South disparity is not an uncommon phenomenon, but Mexico stands out due to its unique location as a still developing country neighboring the world's largest economy, the U.S. The divide manifests itself in widespread poverty, rudimentary infrastructure (for example, insufficient electricity access) and underdeveloped activities of the productive sector. To illustrate, key socioeconomic and economic indicators in the North, such as the HDI, resemble those of the average OECD counterparts, whereas southern conditions resemble those of the LDCs (OECD 2004a). In the face of this regional disparity, regional policies for a more equitable development are being pursued, but they are mostly biased towards the urban and developed areas.

The Mexican economy underwent turbulent developments during the past twenty years, such as the major currency crises of 1982 and 1994/1995, which put severe strains on the development of this newly industrializing country. Deteriorating exchange rates, high inflation rates and, subsequently, insufficient access to credit and sluggish domestic and export demand hampered its development capabilities. The NAFTA accession promoted Mexico's trade and especially manufacturing. The composition of exports changed remarkably, from an average manufacturing share of 9% from 1982 to 1996 (Statistics Canada) to 76.9% in 2005 (WTO). The bulk of these exports, e.g. automotive parts and electronic equipment, is directed to the U.S. The production sites are primarily located close to this major trading partner and are typically maquiladoras that have been established in duty-free zones to assemble semi-finished inputs and to re-export the final goods.
Few maquiladoras exist in the centre or even the south, where the typical business is rather a micro-enterprise employing the owner or a few unremunerated family members (OECD 1997). Technological standards have been low; for example, only .5% of all firms were using state-of-the-art technology in 1992 (OECD 1997). Against this background, we consider it meaningful and relevant to explore empirically the characteristics and performance of the manufacturing sector at the regional level. We build our analysis on a unique panel comprising the majority of Mexican municipalities (2066 out of 45) and the years 1989, 1999

and 2004. It offers the opportunity to explore the hypothesis of a pronounced North-South divide in production and efficiency characteristics of the manufacturing sector at the regional level. We employ different parametric panel techniques, such as error component specifications, as well as parametric stochastic efficiency analysis. The outline of the paper is as follows: Section 2 reviews central underpinnings of production theory, followed by a presentation of the econometric framework, i.e. traditional panel methods like fixed effects and stochastic frontier analysis (SFA). Subsequently, Section 3 introduces and surveys the database. Section 4 summarizes the most prominent empirical findings and highlights important conclusions. Section 5 concludes.

2) Theoretical Background and Econometric Specifications

2.1 Production Economics

The technological possibilities of firms and industries can be summarized by means of production functions, which represent the technical relationship between the level of inputs and the resulting level of outputs.[1] An econometric production function estimation from observed input-output combinations therefore determines the average level of outputs that can be produced from a given level of inputs (Schmidt 1986). Different algebraic forms describe the technology of the industry.[2] The most frequently used in empirical applications are the Cobb-Douglas and the Translog, which depend on different assumptions regarding returns to scale and substitution elasticities. The Translog function is defined by a second-order (all cross-terms included) log-linear form and represents a relatively flexible functional form, as it imposes neither constant elasticities of production nor constant elasticities of substitution between inputs (see Coelli 2005).[3]
[1] The principal properties of production functions that underpin the economic analysis are non-negativity, weak essentiality, and being non-decreasing and concave in the different inputs (for a detailed mathematical analysis of production function characteristics see Coelli 2005).
[2] Among the most important are the Linear, the Quadratic, the Normalized Quadratic, the Generalized Leontief and the Constant Elasticity of Substitution (CES) function.
[3] It thus allows the data to indicate the actual curvature of the function, rather than imposing a priori assumptions.

The multiple-input Translog production function in a general form is defined as:

\[ \ln y = \beta_0 + \sum_{k=1}^{K} \beta_k \ln x_k + \frac{1}{2} \sum_{k=1}^{K} \sum_{m=1}^{M} \beta_{km} \ln x_k \ln x_m \]

where $\ln y$ represents output in log form, $\ln x_k$ the different inputs and $\ln x_k \ln x_m$ the squared and cross terms; $\beta_0$ is the intercept or constant term, interpreted as an efficiency parameter (Coelli et al. 2005), and the $\beta_k$ and $\beta_{km}$ are the parameters to be estimated.

The Cobb-Douglas production function is characterized by more restrictive assumptions regarding returns to scale and the elasticity of substitution. The elasticity of substitution has a constant value of one, i.e. the functional form imposes a fixed degree of substitutability on all inputs, and the elasticity of production is constant for all inputs. The Cobb-Douglas function can therefore be expressed as:

\[ \ln y = \beta_0 + \sum_{k=1}^{K} \beta_k \ln x_k \]

The variables are defined as before. As one can see, the Cobb-Douglas is the special case of the Translog production function in which all $\beta_{km}$ are zero. We therefore start our analysis with an estimation of the Translog functional form and, in a second step, impose the restrictions of the Cobb-Douglas functional form.[4]

2.2 Traditional Panel Data Models

We only present the econometric specification of the Translog function; assumptions and reasoning are equally valid for the Cobb-Douglas function, as it is nested in the Translog function as shown above. Table 1 gives an overview of the estimation specifications.

[4] The large number of observations at our disposal, which rules out a degrees-of-freedom problem, allows us to start with the estimation of the Translog, which includes a large number of parameters.

The basic relation is as follows:

\[ y_{jt} = \beta_0 + \beta_1 k_{jt} + \beta_2 l_{jt} + 0.5\,\beta_3 l_{jt}^2 + 0.5\,\beta_4 k_{jt}^2 + \beta_5 l_{jt} k_{jt} + \varepsilon_{jt}, \quad j = 1,\dots,n, \quad t = 1,\dots,3 \tag{1} \]

Here $j$ refers to the respective municipality, with $n$ equal to 2066, and $t$ covers the years 1989, 1999 and 2004; $y_{jt}$ is the natural logarithm of (real) gross value added, $k_{jt}$ the logarithm of (real) total inputs and $l_{jt}$ the logarithm of the number of employees (see Section 3 for a detailed presentation of the selected variables). The constant $\beta_0$ is interpreted as a productivity parameter and the coefficients $\beta_1$ and $\beta_2$ as output elasticities with respect to total inputs and labor, respectively.

The first step is a pooled LS estimation of (1) with the municipalities as the unit of interest. Each $\varepsilon_{jt}$ is assumed to have zero mean and to be identically and independently distributed (over time and municipalities): $\varepsilon_{jt} \sim IID(0, \sigma_\varepsilon^2)$. Estimates are unbiased and efficient under these classic assumptions. To capture the information structure of a panel, we start with a one-way error component model for the disturbances, which is specified as follows:

\[ \nu_{jt} = \upsilon_j + \varepsilon_{jt} \]

The $\varepsilon_{jt}$ resemble the disturbances of a classic regression model, i.e. they vary over time and units $j$. The error component $\upsilon_j$ is constant over time $t$ and affects only cross-sectional unit $j$. Depending on further assumptions about this municipality-specific effect, two different regression approaches arise: the fixed effects (FE) and the random effects (RE) model. An LS estimation of (1) may be challenged by unobserved heterogeneity across municipalities that is correlated with the regressors; the FE model would nevertheless be consistent in this case. Our production function specified as an FE model reads as follows:

\[ y_{jt} = \beta_1 k_{jt} + \beta_2 l_{jt} + 0.5\,\beta_3 l_{jt}^2 + 0.5\,\beta_4 k_{jt}^2 + \beta_5 l_{jt} k_{jt} + \upsilon_j + \varepsilon_{jt} \tag{2} \]

Here $\varepsilon_{jt}$ is the idiosyncratic disturbance as described before, and $\upsilon_j$ is a municipality-specific effect that may be correlated with the covariates and is estimated as a parameter. Estimators are obtained by least squares dummy variable (LSDV) estimation.[5]

The RE approach differs from the FE approach by regarding the effect $\upsilon_j$ as random. It explores unobserved effects in the error variances and assumes that the municipality-specific effect $\upsilon_j$ is not correlated with the regressors over time:

\[ y_{jt} = \beta_0 + \beta_1 k_{jt} + \beta_2 l_{jt} + 0.5\,\beta_3 l_{jt}^2 + 0.5\,\beta_4 k_{jt}^2 + \beta_5 l_{jt} k_{jt} + \upsilon_j + \varepsilon_{jt} \tag{3} \]

The random effects $\upsilon_j$ are assumed to be identically and independently distributed with $(0, \sigma_\upsilon^2)$, and the $\varepsilon_{jt}$ are likewise IID with $(0, \sigma_\varepsilon^2)$. Both are mutually independent and also independent of the regressors across all units and over time. LS estimation of (3) is consistent but not efficient, as it does not account for the fact that the composite disturbances $\nu_{jt} = \upsilon_j + \varepsilon_{jt}$ are not IID.[6] Generalized least squares (GLS) is an efficient estimation technique, or feasible GLS (FGLS) if, as in our case, the variance components $\sigma_\varepsilon^2$ and $\sigma_\upsilon^2$ are unknown. FGLS estimates these consistently and transforms the data by accounting for the estimated covariance structure (see Balestra and Nerlove 1966). The subsequent LS estimation with the transformed variables results in asymptotically efficient estimators for t or n approaching infinity (Baltagi 2005).

2.3 The Unbalanced Nested Error Component Model

An interesting feature of our dataset is its inherent natural grouping, since each municipality belongs to exactly one of the 32 states of Mexico (see Table 3). Given this nested structure, we regard it as likely that individual effects are associated with both the state and the municipality level.
To illustrate, unobserved factors such as education, and consequently the quality of labor, can be considered municipality-specific, whereas the quality of highway infrastructure is state-specific; both exert an influence on the variables (capital-) input and labor input. To account for this structure, we keep our basic Translog and Cobb-Douglas specifications, but we adopt a single-nested error components model as suggested by Baltagi et al. 2001. It follows that:

\[ y_{ijt} = \beta_0 + \beta_1 k_{ijt} + \beta_2 l_{ijt} + 0.5\,\beta_3 l_{ijt}^2 + 0.5\,\beta_4 k_{ijt}^2 + \beta_5 l_{ijt} k_{ijt} + \mu_i + \upsilon_{ij} + \varepsilon_{ijt}, \quad i = 1,\dots,32, \quad j = 1,\dots,n_i, \quad t = 1,\dots,3 \tag{4} \]

The model has one time-series dimension $t$, but two cross-section dimensions; index $j$ refers to the municipality as before, but nested in state $i$. The model is appropriate for our panel, which is unbalanced in the sense that the number of municipalities $n_i$ is different in each state $i$ (see Table 3). It utilizes a two-level error components specification: $\mu_i$ represents the unobserved specific effect of state $i$ and is assumed to be $IID(0, \sigma_\mu^2)$, $\upsilon_{ij}$ is the effect of municipality $j$ in state $i$, assumed to be $IID(0, \sigma_\upsilon^2)$, and $\varepsilon_{ijt}$ is the remainder disturbance, again $IID(0, \sigma_\varepsilon^2)$. All three error components are furthermore mutually independent of each other. The distribution of the random state- and municipality-specific effects plays a role through the estimates of the variance components.

The single-nested structure model is consistently and efficiently estimated with restricted maximum likelihood (REML).[7] Monte-Carlo studies by Baltagi et al. 2001 show that REML estimators perform especially well in the estimation of variance components, but slightly less so in the estimation of regression coefficients. Being aware of this caveat, we have nevertheless preferred REML to other methods like ANOVA, since econometric packages like STATA support REML.

[5] Note that the overall constant is no longer present in (2) to avoid perfect collinearity.
[6] For details on the covariance structure see Baltagi 2005.

2.4 Efficiency and Productivity Analysis

We further want to relate our findings on the production characteristics to an efficiency analysis of the different municipalities in order to gain a first insight into their ranking, which might be of interest for policy making in Mexico. We analyze at the municipality level whether larger municipalities in the North operate more efficiently and whether efficiency and productivity changed in Mexico over time.
We hereby focus on different models of stochastic frontier analysis (SFA).

[7] This estimation technique is based on partitioning the likelihood function and maximizing only the part that contains just variance components and no regression coefficients (for further details see Patterson and Thompson 1971).

The SFA is a parametric approach to frontier estimation that is able to differentiate between efficient and less efficient decision making units, in our case the municipalities.[8] Within this approach we assume a given functional form for the relationship between inputs and outputs and estimate the unknown parameters of the function by maximum likelihood techniques. In contrast to ordinary least squares (OLS) regression, the stochastic frontier model decomposes the residuals into two terms: a symmetric component representing statistical noise and an asymmetric component representing inefficiency (see Greene 2004).[9] We now look at different SFA model specifications, all summarized in Table 2.

Aigner et al. (1977)

The most general formulation, proposed by Aigner et al. 1977, is as follows (see Greene 2004):

\[ y = \beta' x + v - u, \quad u = |U|, \quad U \sim N[0, \sigma_u^2], \quad v \sim N[0, \sigma_v^2] \tag{9} \]

where $x$ represents the set of explanatory variables (inputs in the case of a production frontier) and $y$ the observed production of a firm; $u$ is the nonnegative random variable associated with inefficiency, and $v$ the symmetric random error accounting for noise. The noise components $v$ are assumed to be independently and identically distributed normal random variables with zero mean and variance $\sigma_v^2$. As the model is usually specified in natural logs, the inefficiency term $u$ can be interpreted as the percentage deviation of observed performance $y$ from the unit's own frontier performance (see Greene 2002).[10]

Stochastic frontier analysis allows the computation of efficiencies of the individual decision units or of the whole industry. A common measure of technical efficiency is the ratio of the observed output to the corresponding stochastic frontier output (see Coelli et al. 2005).
[8] Other parametric, non-stochastic approaches are corrected ordinary least squares (COLS) and modified ordinary least squares (MOLS).
[9] The theory of stochastic frontier production functions was originally proposed by Aigner et al. 1977 as well as Meeusen and van den Broeck 1977.
[10] A large number of variants of the stochastic frontier model with regard to the distributional specification of the inefficiency u have been proposed in the literature. In addition to the half normal distribution of u there are three further common alternatives: the truncated normal (see Stevenson 1980), the exponential and the gamma model (see Greene 1990). An extensive survey of the different models can be found in Kumbhakar and Lovell 2000, who also provide the likelihood functions of the different models for estimation purposes.

In a general form, for both approaches, relative to the production frontier, the measure of technical efficiency TE is defined as:

\[ TE = \frac{E(y \mid u, x)}{E(y \mid u = 0, x)} = \exp(-u) \tag{10} \]

where E is the conditional expectation (see Coelli et al. 1996). TE takes a value between zero and one and indicates the observed output of the jth unit relative to the output that could be produced by a fully efficient unit using the same input vector (production function approach). The above measure of technical efficiency relies upon the predicted value of the unobservable $u$ (see Coelli et al. 2005). It is obtained from the conditional expectation of functions of $u$, conditional upon the observed value of the whole error term $v - u$.[11]

Before selecting a specific model, analysts have to make an initial choice between the two most widely used functional forms: the Cobb-Douglas function and the Translog function. Using the Translog functional form yields the stochastic frontier production function in the following form:

\[ \ln y_j = \beta_0 + \sum_{b=1}^{2} \beta_b \ln x_{bj} + \sum_{b \le k} \sum_{k=1}^{2} \beta_{bk} \ln x_{bj} \ln x_{kj} + v_j - u_j \tag{11} \]

where the $v_j$ are random variables assumed to be IID $N(0, \sigma_v^2)$ and independent of the $u_j$; the $u_j$ are non-negative random variables, usually assumed to be half normally distributed (IID $|N(0, \sigma_u^2)|$), thereby accounting for individual technical inefficiency. The other variables are defined as before.

[11] Jondrow et al. 1982 and Battese and Coelli 1992 derive the conditional predictor of u in detail.

Pitt and Lee (1981)

The discussion so far has not taken into account the special structure of panel data models. This will be introduced in the subsequent discussion. There exists a wide variety of literature with different approaches concerning the structure of the inefficiency effects over time. An overview is given in Kumbhakar and Lovell 2000. Pitt and Lee 1981 have used panel data models within the stochastic frontier analysis for the first time,

defining the panel data random effect as inefficiency, and assuming a half normal or an exponential distribution for the inefficiency terms $u_j$.[12] The model can be defined as:

\[ y_{jt} = \alpha + \beta' x_{jt} + v_{jt} - u_j, \quad v_{jt} \sim N[0, \sigma_v^2], \quad u_j = |U_j|, \quad U_j \sim N[0, \sigma_u^2] \tag{12} \]

This represents a direct extension of the cross section variant outlined above (see Greene 2007). The inefficiency component is assumed to be time-invariant; the results therefore indicate the average technical efficiency of firms across the observation period.

Battese and Coelli (1992)

Battese and Coelli 1992 have introduced a random effects model and relaxed the assumption of time invariance of the inefficiencies. As an extension, they proposed a deterministic function $f(t)$ that determines how technical inefficiency varies over time, $u_{jt} = f(t)\, u_j$ (see Coelli 2005):

\[ f(t) = \exp[-\eta (t - T)] \tag{13} \]

where $\eta$ is an unknown parameter to be estimated.[13] The main criticism of this approach is that the variation of efficiency over time is modeled as a deterministic function. This involves the unrealistic assumption that it is identical for all firms.

[12] Schmidt and Sickles 1984 applied the fixed effects model within the stochastic frontier framework, defining the fixed effect as inefficiency.
[13] The log likelihood function for estimation purposes is presented in Battese and Coelli 1992.

Greene (2005)

Recent developments extend the traditional fixed and random effects models for stochastic frontiers presented above (for a survey see Greene 2005 and 2007). They deal with the following shortcomings of the earlier models. First of all, efficiency estimation in the presented stochastic frontier models typically assumes that the underlying production technology is the same for all units. There might, however, be unobserved differences

in technologies that would be inappropriately labeled as inefficiency if such variations in technology are not taken into account. Greene 2005 summarizes that the models fail to distinguish between cross-individual heterogeneity and inefficiency, because fixed and random effects estimators force any time-invariant cross-unit heterogeneity into the same term that is used to capture the inefficiency. Another shortcoming is that the conventional estimators assume inefficiency to be constant over time. Battese and Coelli 1992 do relax the time invariance, but the random component is still time-invariant, which remains a substantive and detrimental restriction.[14] Greene (2004, 2005) extends the traditional random effects panel data models to the true random effects stochastic frontier model, defined as

\[ y_{jt} = w_j + \alpha + \beta' x_{jt} + v_{jt} - u_{jt}, \quad v_{jt} \sim N[0, \sigma_v^2], \quad u_{jt} = |U_{jt}|, \quad U_{jt} \sim N[0, \sigma_u^2], \quad w_j \sim N[0, \sigma_w^2] \tag{14} \]

where $w_j$ is the time-invariant random effect, i.e. a random constant term representing latent cross-section heterogeneity. It is again a normal-half normal stochastic frontier model.[15] The true random effects model can be seen as a special case of the random parameters model in which the only random parameter is the constant term. The model can be estimated by means of simulated maximum likelihood. For details on the estimation procedure and the identification problem mentioned previously see Greene (2004, 2005 and 2007).

[14] Recall that within the Battese and Coelli 1992 specification the heterogeneity was placed into the inefficiency distribution.
[15] Greene 2007 points out that this appears to be a model with a three-part disturbance, which would certainly be inestimable. He shows that this is not correct: it is a model with a traditional random effect, with the further characteristic that the time-varying disturbance is not normally distributed.

3) Data Description

We have created a balanced manufacturing panel based on regional production data offered by the Mexican national statistical office, Instituto Nacional de Estadística Geografía e Informática (INEGI). The original data stem from the economic census and are provided for the years 1989, 1994, 1999 and 2004. It covers all

Mexican municipalities - between 48 municipalities in 1989 to 45 in 2004, all of them organized into 32 states (see Table 3). Our primary unit of observation in the panel is the municipality. The economic census reports only a few of the variables at the aggregate level consistently over the years. We have at least obtained the most relevant ones for our research question, which are gross value added $Y_{jt}$, total inputs $K_{jt}$ and the number of persons employed $L_{jt}$. These variables are commonly used in conventional estimations of regional production functions, see e.g. Coelli et al. 2005 or Hsing 1996. Note that the panel completely excludes economic activity and production entities of the informal sector, which play a substantial role in Mexico.

Monetary variables such as value added and total inputs are provided in U.S. Dollars instead of Mexican Pesos in 1994, which is likely to be due to the Mexican economic crisis ("Peso crisis"). We converted the two affected data series into Mexican Pesos with the average yearly exchange rate of 1994 as provided by the Mexican central bank (Banxico). As is usually done with sectoral panel data, the monetary series value added and total inputs were deflated by appropriate producer price indices (PPI), here the Banxico PPI for finished goods excluding oil, to capture these variables in real terms (see Salgado-Banda and Bernal-Verdugo 2007). The adjustments were undertaken with respect to the base period December 2003.

Inspection of descriptive statistics and preliminary regression results indicated that the year 1994 is problematic, as for example the mean of $K_{jt}$ is unusually high (see Table 4), especially for a period of economic downturn. This can be related to general data insufficiencies in the inflation-afflicted year 1994 and subsequent problems with the conversion to Peso values or with deflation. For all further analyses we therefore employ a balanced panel excluding the year 1994.
Additionally, the variables $Y_{jt}$, $L_{jt}$ and $K_{jt}$ have been mean-corrected and transformed with the natural logarithm to minimize the leverage of outliers and for ease of interpretation as straightforward elasticities. They are henceforth denoted as $y_{jt}$, $k_{jt}$ and $l_{jt}$. We dealt with zero, missing or implausible observations, such as a negative number of employees, by deleting these observations. Henceforth, we base our analysis on a panel of 2066 different municipalities, which is balanced in the sense that each municipality is available for the years 1989, 1999 and 2004, thus amounting to 6198 observations altogether (see Table 4).
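The link between mean-corrected logs and "straightforward elasticities" can be made explicit with a short sketch. It assumes that mean-correction means dividing each series by its sample mean before taking logs, and that the Translog has the form of equation (1); the function names and coefficient values are hypothetical and are not taken from the estimation results. Under these assumptions the output elasticities at the sample mean (where k = l = 0) reduce to the first-order coefficients.

```python
import numpy as np

def mean_corrected_log(x):
    """Divide a positive series by its sample mean, then take natural logs."""
    x = np.asarray(x, dtype=float)
    return np.log(x / x.mean())

def translog_elasticities(beta, k, l):
    """Output elasticities implied by a Translog of the form of equation (1).

    beta = (b0, b1, b2, b3, b4, b5), where b3 multiplies 0.5*l**2,
    b4 multiplies 0.5*k**2 and b5 multiplies l*k.
    """
    b0, b1, b2, b3, b4, b5 = beta
    e_k = b1 + b4 * k + b5 * l   # derivative of y with respect to k
    e_l = b2 + b3 * l + b5 * k   # derivative of y with respect to l
    return e_k, e_l

# Hypothetical coefficients, chosen only for illustration.
beta = (0.0, 0.74, 0.40, 0.03, 0.03, -0.01)

# At the sample mean the mean-corrected logs are zero, so the
# elasticities equal the first-order coefficients b1 and b2.
e_k, e_l = translog_elasticities(beta, k=0.0, l=0.0)
returns_to_scale = e_k + e_l

# A level equal to the sample mean maps to a mean-corrected log of zero.
z = mean_corrected_log([1.0, 2.0, 3.0])
```

Away from the sample mean the elasticities depend on the input levels through the second-order terms, which is why the Translog allows returns to scale to vary across municipalities.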

4) Empirical Results

4.1 Traditional Panel Data and Nested Error Component Models

Traditional Panel Data Models

We start the discussion with the results from the least squares and error components estimations for the Translog case (eq. (1) to (4), Table 1) and the Cobb-Douglas case (eq. (5)-(8), Table 1). Estimation results are shown in Tables 5 and 6. A comparison of the estimated elasticities is unfortunately rarely feasible, as equivalent studies for Mexico have only been conducted at the firm level (see Lopez-Cordova and Mesquita-Moreira 2002 or Salgado-Banda and Bernal-Verdugo 2007) and regional approaches exist only for developed countries (see e.g. Hsing 1996).

LS estimation of the Translog production function (eq. (1)) yields significant and positive estimates of the coefficients, with the exception of the insignificant cross term. The elasticity of value added with respect to labor input is 0.395 and with respect to (capital-) input is 0.740, when evaluated at the mean of these variables. The estimates of the squared terms are in both cases positive (0.030 for $k_{jt}$ and 0.06 for $l_{jt}$). We additionally test whether the production characteristics exhibit constant returns to scale, i.e. the joint hypothesis that the coefficients of the squared and cross terms are zero and that those of $k_{jt}$ and $l_{jt}$ add up to one. This hypothesis is rejected, and there is clear evidence for variable, more precisely increasing, returns to scale (F(5,6192): 131.84). Thus, one has to account for the level of the input under consideration and the level of all other inputs to assess the overall impact of a change in an input.

The validity of these results may be challenged by the possibility that the choice of the two observed inputs $k_{jt}$ and $l_{jt}$ is related to unobserved factors. To tackle this issue and avoid biased LS estimates, we control for these effects by estimating an FE model (see Mundlak 1961).
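The logic of controlling for municipality-specific effects can be sketched with the within transformation, which is numerically equivalent to LSDV for the slope coefficients. The example below is a minimal illustration on simulated data, not the estimation code used for the reported results; the helper name `within_estimator`, the simulated coefficients (0.75 and 0.45) and the negative correlation between the effect and one input are assumptions made for the illustration.

```python
import numpy as np

def within_estimator(y, X, groups):
    """Fixed effects (within) estimator: demean y and X by group, then run LS."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    _, idx = np.unique(np.asarray(groups), return_inverse=True)
    counts = np.bincount(idx)
    y_mean = np.bincount(idx, weights=y) / counts
    X_mean = np.column_stack(
        [np.bincount(idx, weights=X[:, c]) / counts for c in range(X.shape[1])]
    )
    beta, *_ = np.linalg.lstsq(X - X_mean[idx], y - y_mean[idx], rcond=None)
    return beta

# Simulated panel: 200 units observed in 3 periods (hypothetical values).
rng = np.random.default_rng(0)
n, T = 200, 3
groups = np.repeat(np.arange(n), T)
effect = rng.normal(size=n)                          # unobserved unit effect
k = rng.normal(size=n * T) - 0.5 * effect[groups]    # input correlated with the effect
l = rng.normal(size=n * T)
y = 0.75 * k + 0.45 * l + effect[groups] + 0.1 * rng.normal(size=n * T)
X = np.column_stack([k, l])

beta_fe = within_estimator(y, X, groups)
# Pooled LS with a constant, for comparison (slopes only).
beta_ls = np.linalg.lstsq(np.column_stack([np.ones(n * T), X]), y, rcond=None)[0][1:]
```

In this simulation the pooled LS coefficient on `k` is biased downward because the omitted effect is negatively correlated with the input, while the within estimator recovers a value close to 0.75.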
Inspection of the FE results shows that individual heterogeneity does indeed play a role in our data. The overall R-squared (0.951) and the within R-squared (0.748) are quite distinct from each other, a first piece of evidence in favor of FE as an adequate approach. All coefficients are individually as well as jointly significant at the 1% level (F(5,4127): 454.97). Controlling for individual-specific effects does not lead to a drop in the estimates, as frequently occurs, but rather to a rise. This can be explained by either measurement errors or a negative correlation between

unobserved heterogeneity and the covariates.[16] This is not an unlikely phenomenon; one can think of factors like ongoing insufficient electricity access that negatively affect the choice of inputs. Interpretations of the underlying production characteristics resemble those from the LS estimation, i.e. we find a positive, nonlinear relationship between value added and the two inputs. The elasticity with respect to $k_{jt}$ (0.759) is again clearly higher than that for labor $l_{jt}$ (0.451). Both squared terms are positive and the hypothesis of constant returns to scale is again rejected (F(4,4127): 64.99). The cross term is negative (-0.039), indicating that (capital-) input and labor are substitutes.

Municipality-specific effects are jointly significant, thus providing evidence that unobserved heterogeneity prevails in the data (F(2065, 4127): 1.73). In addition, the estimated within-municipality standard deviation is slightly smaller than the between-municipality deviation (0.59 vs. 0.588); roughly half of the total variation of value added can thus be explained by variation within municipalities ($\rho$: 0.447).

Turning to the RE estimation, we observe that the size of the RE coefficients lies between the LS and FE results, which occurs when the variances of the municipality-specific effects and of the idiosyncratic shocks are similar in size. To determine which specification is more appropriate, FE or RE, we turn to the reported correlation between the FE residuals (i.e. the predicted municipality effects) and the predicted values of value added, which is clearly different from zero (0.336), a first indication against the RE specification. We utilize a Hausman 1978 test for orthogonality of random effects and covariates. The null hypothesis is clearly rejected, and we conclude that correlation between unobserved effects and inputs exists, thus favoring the FE specification over RE (chi-squared (5): 41.70).
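The Hausman comparison of FE and RE estimates can be written compactly. The sketch below computes the standard statistic, the quadratic form of the coefficient difference in the inverse of the difference of the two covariance matrices, for made-up inputs; the numbers are hypothetical and are unrelated to the estimates in Tables 5 and 6. The critical value 5.991 is the 5% point of a chi-squared distribution with two degrees of freedom.

```python
import numpy as np

def hausman_statistic(b_fe, b_re, V_fe, V_re):
    """Hausman statistic for FE versus RE slope estimates.

    Under the null of no correlation between the effects and the regressors,
    the statistic is asymptotically chi-squared with len(b_fe) degrees of freedom.
    """
    d = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    V = np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float)
    return float(d @ np.linalg.solve(V, d))

# Hypothetical estimates and covariance matrices, for illustration only.
b_fe = [1.0, 2.0]
b_re = [0.9, 1.8]
V_fe = np.diag([0.02, 0.05])
V_re = np.diag([0.01, 0.01])

H = hausman_statistic(b_fe, b_re, V_fe, V_re)

CHI2_95_DF2 = 5.991          # 5% critical value, 2 degrees of freedom
reject_re = H > CHI2_95_DF2  # True would favor FE over RE
```

With these illustrative inputs the statistic is small and RE would not be rejected; a large value, as reported in the text, points to correlation between the unobserved effects and the inputs.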
The FE specification regards regional effects as constant over the panel period of fifteen years, which may seem less persuasive against the background of this dynamic and even disruptive period for a newly industrializing country. An alternative approach for future research would therefore be to use a dynamic model.

Nested Error Component Models

We additionally adopt a single-nested error components model. The signs of the estimated coefficients resemble those of the RE estimation and thus do not offer further insights for the interpretation of coefficients.

[16] The incidence of measurement errors is e.g. related to misreported data. As we have no further evidence on the nature of this error, we do not further discuss this issue or potential remedies such as IV.

The estimated variance of the state effects ($\sigma_\mu$: 0.149) is lower than that of the municipality effects ($\sigma_\upsilon$: 0.17), which is in turn clearly smaller than the idiosyncratic variance ($\sigma_\varepsilon$: 0.601). Apart from estimating the variance components and coefficients themselves, we are additionally interested in obtaining the municipality- and state-specific intercepts as best linear unbiased predictions (BLUPs) in order to explore the actual regional differences. The ECM specification is especially suitable for our research question, as it is able to reveal the different relevance of unobserved factors depending on the respective regional level. Figures 1 and 2 show systematic differences in size and sign of the intercepts between the states. The intercepts, commonly interpreted as efficiency parameters, at either the state or the municipality level are negative for southern states such as Yucatan or Oaxaca, but large and positive for the northern states Aguascalientes, Baja California Sur or Chihuahua. Typically, the intercepts at the state level are much higher than those at the municipality level.

Figure 3 illustrates the best linear unbiased predictions of the municipality-specific effects from the FE model. It displays a similar ranking of the states as the ECM specification (see Figure 1).[17] Interestingly enough, both FE and ECM predictions provide evidence that the dispersion of the size of municipality effects within one state is large. To illustrate, both models rank the municipality El Barrio de la Soledad in the state of Oaxaca among the five best performing municipalities, although the state of Oaxaca ranks among the worst states. El Barrio de la Soledad hosts two very successful and well-known cement cooperations, which operate in an otherwise typical agricultural region.
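How state effects, municipality effects and idiosyncratic shocks combine in the nested error structure can be illustrated by simulation. The sketch below is not the REML procedure used for the estimates; it draws the three components with hypothetical standard deviations, recovers the idiosyncratic variance from within-municipality variation, and computes the correlations the structure implies between two observations of the same municipality and between two municipalities of the same state.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical standard deviations of the three error components.
s_mu, s_ups, s_eps = 0.4, 0.5, 0.8
n_states, T = 32, 3

# Unbalanced design: a different number of municipalities in each state.
n_i = rng.integers(5, 120, size=n_states)
state = np.repeat(np.arange(n_states), n_i)      # state index per municipality

mu = rng.normal(0.0, s_mu, n_states)[state]      # state effect, shared within a state
ups = rng.normal(0.0, s_ups, state.size)         # municipality effect
eps = rng.normal(0.0, s_eps, (state.size, T))    # idiosyncratic shock per period
u = (mu + ups)[:, None] + eps                    # composite error, one row per municipality

# Deviations from municipality means remove mu and ups,
# so they identify the idiosyncratic variance.
within = u - u.mean(axis=1, keepdims=True)
s_eps2_hat = (within ** 2).sum() / (state.size * (T - 1))

# Correlations implied by the nested structure.
total = s_mu ** 2 + s_ups ** 2 + s_eps ** 2
rho_same_municipality = (s_mu ** 2 + s_ups ** 2) / total
rho_same_state = s_mu ** 2 / total
```

The larger the state variance relative to the total, the more two municipalities of the same state move together, which is the feature a single-level RE model cannot capture.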
Careful inspection of the regional effects thus indicates that the North-South divide is clearly manifest, but that variation within the states is large, a phenomenon demonstrating that municipalities can flourish in otherwise desolate states.

The same estimation procedures as discussed above have also been conducted with the Cobb-Douglas production function (for the specification see Table 1). The estimated coefficients are generally lower than with the Translog function, and in all cases the hypothesis of constant returns to scale is rejected (Table 6). The Hausman test again favors the FE specification over RE (chi-squared(): 337.8). To discriminate between the two production functions, we employed an LR test for the nested ECM case. 18 We can clearly reject the hypothesis that the difference between the log-likelihoods evaluated at the restricted (Cobb-Douglas) and unrestricted (Translog) models is zero (see Table 6). For the three previous specifications we tested the joint hypothesis that the squared and cross terms equal zero. In all cases there was clear evidence against the Cobb-Douglas specification. We can thus conclude that the underlying technology is not characterized by constant returns to scale over all levels of input, and we favor the more flexible Translog model over the restrictive Cobb-Douglas model.

17 In order to obtain predictions from an FE model, we ran a regression that explicitly includes dummies in the set of regressors. Subsequently, linear predictions of the municipality effects were possible with STATA. Figure 3 then shows the average of the municipality effects in the respective state.

18 To conduct the LR test, the ECM specification had to be re-estimated with ML.

4. Stochastic Frontier Analysis

We now turn to the results of the parametric stochastic frontier analysis in order to analyze the efficiency differential at municipality and state level. 19 With the SFA results we want to test whether there are still large and sustained differences between the municipalities in the North and the South of the country, and whether technical efficiency in the Mexican manufacturing sector changed over time. As pointed out in Table, we start with the estimation of a pooled cross-section stochastic frontier according to the Aigner et al. (1977) specification. In a second step we deal explicitly with panel data models for SFA, starting with the Pitt and Lee (1981) model, which assumes that the inefficiencies do not vary over time. To examine how efficiency changed over time, we also analyze the Battese and Coelli (1992) specification. The last model, the true random effects model (Greene 2005), helps us to distinguish municipality-specific inefficiency from unobserved heterogeneity. The estimation results are summarized in Tables 7 and 8.

Cross section SFA (Aigner et al. 1977)

In a first step we assume that each observation in the whole period is independent; we therefore estimate a cross-section SFA for the manufacturing sector. The coefficient estimates for the normal-half normal model are shown in Table 7. The kernel density distribution of the estimated inefficiencies is outlined in Table 7.
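As a point of reference, the pooled normal-half normal frontier can be written in its textbook form. This is a sketch under stated assumptions: the index j runs over pooled municipality-year observations, $x_j$ collects the (Translog) input terms, and $\sigma^2$ follows the common parameterization as the sum of the two variance components, which the source does not spell out.

```latex
\ln y_j = x_j'\beta + \nu_j - u_j,
\qquad \nu_j \sim N(0, \sigma_\nu^2),
\qquad u_j \sim N^{+}(0, \sigma_u^2),
\qquad \lambda = \frac{\sigma_u}{\sigma_\nu},
\qquad \sigma^2 = \sigma_\nu^2 + \sigma_u^2
```

A larger $\lambda$ means that a larger share of the composite error $\nu_j - u_j$ is attributed to inefficiency rather than to symmetric noise.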
In the normal-half normal model, the estimates consist of β, the coefficients of the input variables, λ and σ, and the usual set of diagnostic statistics for models fit by maximum likelihood. The estimated coefficients of the labor and capital elasticities reflect the same trend as in the previous panel specifications.

19 We assumed for all SFA models the more appropriate Translog specification, as shown in Section 4.1. Predicted technical inefficiencies were calculated according to Jondrow 1981, assuming different distributions for the technical inefficiencies.
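The Jondrow-type predictor of technical inefficiency mentioned in the footnote can be illustrated with a minimal sketch for the normal-half normal case. The function names, the parameterization by $\sigma_u$ and $\sigma_\nu$, and the residual values are assumptions made for illustration; they are not taken from the paper's estimation output.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def jlms_half_normal(eps: float, sigma_u: float, sigma_v: float) -> float:
    """Conditional mean E[u | eps] for a production frontier with eps = v - u.

    Assumes v ~ N(0, sigma_v^2) and u half-normal with scale sigma_u.
    """
    sigma_sq = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma_sq
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma_sq)
    z = mu_star / sigma_star
    return mu_star + sigma_star * norm_pdf(z) / norm_cdf(z)


# Hypothetical composite residuals, not values from the paper.
for eps in (-1.0, 0.0, 1.0):
    u_hat = jlms_half_normal(eps, sigma_u=1.0, sigma_v=1.0)
    print(f"eps={eps:+.1f}  E[u|eps]={u_hat:.4f}  efficiency={math.exp(-u_hat):.4f}")
```

A more negative residual (output further below the frontier) yields a larger predicted inefficiency, which is the ordering the efficiency rankings rely on.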

For the compound error term we obtained the variance parameters $\lambda = \sigma_u / \sigma_\nu = 0.87$ and $\sigma^2 = \sigma_\nu^2 + \sigma_u^2 = 0.77$. These results show that there is positive variance in the technical inefficiencies. The statistics for the kernel distribution of the inefficiencies indicate a mean of 0.400, a minimum of 0.08 and a maximum of.4. This shows huge differences in the productivity level of the different Mexican municipalities and a mean technical inefficiency level of 0.40. A special interest of the analysis lies in estimating the individual efficiencies of the municipalities in the sample. 20 In the following we summarize the main characteristics of the individual technical efficiency scores, focusing on the hypothesis that larger municipalities in the north of the country operate in a more productive and more efficient way. Table 8 summarizes the mean technical efficiencies and their standard deviations at state level for the different SFA specifications. The mean technical efficiency level in the manufacturing sector is 0.677; this figure indicates that on average the same output could be produced with only 67 per cent of the actual input. We are particularly interested in the different performance levels of the states. We therefore calculated the average of the municipality-level technical efficiency for each state. Table 8 shows that Chihuahua, Aguascalientes, Sonora, Coahuila de Zaragoza and Baja California Sur are among the best-performing states. All, apart from Aguascalientes, lie at the US border in the North of the country. This trend is confirmed by all model variations of the SFA. When we consider the worst-performing states, we find Chiapas, Nayarit, Veracruz de Ignacio de la Llave, Puebla, Distrito Federal, Oaxaca and Yucatán, all in the southern part of the country.
Thus large and sustained differences are still prevalent in the economic structure and performance of sub-national regions in Mexico, and southern municipalities still appear to suffer from a lack of technical efficiency in comparison to the North. This might be explained by the geographical closeness to the US border, where the northern regions benefit from the connectivity to trans-border markets in the United States. A considerably different industrial structure of the South in comparison to the North might be another reason. Further, the South is dominated by micro firms with low-skilled employees; labor productivity is accordingly very low in the South. In addition, the southern states are confronted with lower levels of transport connectivity than the northern regions.

20 As outlined in Section.4, under the assumption of a log-specification, the efficiency of the municipalities would be Efficiency = exp(−u).
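The step from predicted inefficiencies to state-level mean efficiencies and standard deviations can be sketched as follows. This is a minimal illustration of Efficiency = exp(−u) followed by a group-wise summary; the state labels and inefficiency values are hypothetical and the function names are assumptions, not the authors' code.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def technical_efficiency(u: float) -> float:
    """Efficiency score implied by a log-specification: exp(-u)."""
    return math.exp(-u)


def state_summary(records):
    """Mean and standard deviation of municipality efficiencies per state.

    records: iterable of (state, predicted_inefficiency) pairs.
    Returns {state: (mean_efficiency, sd_efficiency)}.
    """
    by_state = defaultdict(list)
    for state, u in records:
        by_state[state].append(technical_efficiency(u))
    return {
        state: (mean(scores), stdev(scores) if len(scores) > 1 else 0.0)
        for state, scores in by_state.items()
    }


# Hypothetical municipality-level inefficiencies, not values from the paper.
sample = [("State A", 0.1), ("State A", 0.3), ("State B", 0.5), ("State B", 1.2)]
for state, (avg, sd) in state_summary(sample).items():
    print(f"{state}: mean efficiency {avg:.3f}, standard deviation {sd:.3f}")
```

A low mean combined with a high standard deviation is the pattern that signals strong within-state disparity.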

It is also important to know to what extent the technical efficiency scores vary within each region in order to assess the disparity in each state. We calculated the standard deviation in each state and observe that the states with a very low technical efficiency score also feature a high standard deviation, which indicates a high disparity within these states, e.g. in Oaxaca and Nayarit. Thus, these states also contain municipalities which operate in a more productive way. The disparity is also reflected when we look not at the average in each state, but at the individual scores at municipality level. Within this framework we find that the most productive municipality is Malinaltepec in Guerrero, followed by San Bartolo Coyotepec, Guerrero, Cuautinchán, Zacualpan. This shows that there are indeed municipalities performing very well in Oaxaca. The least productive municipalities are Santo Domingo Teojomulco, San Antonio Tepetlapa, Tahmek, Quiriego and Yaxkukul. We now consider in more detail the size of the municipalities 21 and relate it to the level of technical efficiency for 1989 to 2004. The technical efficiency scores for each year are represented in Figure 4, where the municipalities are ordered by size, starting with the largest municipality on the left-hand side and ending with the smallest one on the right-hand side. One can see that the larger municipalities are on average more efficient. Calculating the mean efficiencies of the 50% largest in contrast to the 50% smallest ones confirms the result (0.70 vs. 0.68 in 1989; 0.68 vs. 0.63 in 1999; 0.7 vs. 0.65 in 2004). This supports the presence of scale inefficiencies in the production process due to the increasing returns to scale found in the previous panel data analysis.
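The comparison of the larger and the smaller half of the municipalities can be sketched in a few lines. The function name and the sample values are hypothetical; size stands for value added, as in the paper's definition, and with an odd number of units the middle unit is assigned to the smaller half by assumption.

```python
from statistics import mean


def mean_efficiency_by_size_half(units):
    """Mean efficiency of the larger and the smaller half of the units.

    units: list of (size, efficiency) pairs with at least two entries.
    Returns (mean_efficiency_large_half, mean_efficiency_small_half).
    """
    ranked = sorted(units, key=lambda pair: pair[0], reverse=True)
    half = len(ranked) // 2
    large, small = ranked[:half], ranked[half:]
    return mean(e for _, e in large), mean(e for _, e in small)


# Hypothetical (value added, efficiency) pairs, not values from the paper.
sample = [(100.0, 0.8), (50.0, 0.7), (10.0, 0.6), (5.0, 0.5)]
large_mean, small_mean = mean_efficiency_by_size_half(sample)
print(f"larger half: {large_mean:.2f}, smaller half: {small_mean:.2f}")
```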
Pitt and Lee (1981)

We start with the approach of Pitt and Lee (1981), who were the first to use panel data models within stochastic frontier analysis, defining the panel data random effect as inefficiency and assuming a half-normal distribution for the inefficiencies $u_j$. This approach is characterized by the assumption of time invariance of the inefficiencies. It confirms the most and least efficient states found in the cross-section analysis. Within this specification we obtain an average efficiency over the observation period for each municipality due to the assumption of time invariance of $u_j$. The municipalities Guerrero, Huichapan, El Barrio de la Soledad, Cuautinchán, Fronteras and Apaxco are again among the most efficient.

21 Size is in our analysis defined as the value added produced in each municipality.

22 All results can be confirmed by model variations with regard to the exponential and gamma distribution of the technical inefficiencies.

Battese and Coelli (1992)

Over an observation period of 15 years it seems problematic to assume time invariance of the technical inefficiencies. In a subsequent estimation we therefore applied the Battese and Coelli (1992) model. Estimation results for this SFA model are presented in Table 7. Assuming time variance of the inefficiencies leads to a negative and significant estimate of η = −0.1700 (p-value = 0.00), as outlined in Table 7. This suggests an industry-wide decrease of 1.7% in technical efficiency over the period 1989 to 2004. Therefore, municipality-specific technical efficiency estimates are decreasing. The average decrease might be due to the fact that the Mexican economy underwent turbulent developments in the past twenty years, as outlined in Section 1, with one of the major currency crises in 1994/1995. 23

Greene (2005)

As mentioned, the main shortcoming of the panel data models specified and estimated above is that any unobserved time-invariant but municipality-specific heterogeneity is treated as inefficiency. To overcome this problem, and to analyze whether unobserved heterogeneity plays an important role in determining efficiency, we estimate in a further step the random effects models derived by Greene (2004, 2005). We focus in our analysis on the random effects specification outlined in Section 3.3. 24 25 Estimation results show that the coefficients remain approximately the same (see Table 7). When we look at the individual efficiency scores at municipality level we notice two main trends. First, the variation of technical efficiency over time is not the same for all municipalities. In some municipalities it has been increasing over the years, in others rather decreasing.
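The three panel specifications discussed in this section differ mainly in how the inefficiency term is allowed to vary. A compact sketch in standard notation is given here; the common intercept $\alpha$, the final period $T$ and the index t for years are assumed notation, while $u_j$, $u_{jt}$, $\eta$ and $w_j$ follow the symbols used in the text.

```latex
\begin{aligned}
\text{Pitt and Lee (1981):} \quad
  & \ln y_{jt} = \alpha + x_{jt}'\beta + \nu_{jt} - u_j \\
\text{Battese and Coelli (1992):} \quad
  & \ln y_{jt} = \alpha + x_{jt}'\beta + \nu_{jt} - u_{jt},
  \qquad u_{jt} = \exp\!\left[-\eta\,(t - T)\right] u_j \\
\text{True random effects (Greene 2005):} \quad
  & \ln y_{jt} = \alpha + w_j + x_{jt}'\beta + \nu_{jt} - u_{jt}
\end{aligned}
```

Under this sign convention a negative $\eta$ implies that inefficiency grows over time for every municipality, which corresponds to the industry-wide decline in technical efficiency reported for the Battese and Coelli model.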
This is an interesting feature of the underlying specification, which is able to model the individual inefficiency change of each municipality. The Battese and Coelli (1992) specification defines an industry-wide improvement or decrease depending on the sign of the estimated η. This is a general deterministic function which describes the average variation over time, as all municipalities feature the same η. Now we obtain a more precise municipality-specific trend of the performance changes over time. 26 Second, in comparison to the technical inefficiencies obtained from the Pitt and Lee specification, the inefficiency estimates obtained from the true random effects specification are lower on average. In this form, $u_{jt}$ is the inefficiency and is time-varying, while the latent heterogeneity is captured by the random effect $w_j$. The empirical results therefore show that in the Pitt and Lee specification the inefficiency term also contains all other time-invariant unmeasured sources of heterogeneity (see Greene 2007). In the true random effects model these effects appear in $w_j$, and $u_{jt}$ picks up the inefficiency. We can conclude that the inefficiency estimates are sensitive to the specification of unobserved community-specific heterogeneity, and therefore the inefficiency scores obtained from the traditional specifications (including unobserved environmental factors) most likely overstate the inefficiency of the municipalities. However, the overall trend found in the other models remains valid: small municipalities operate less efficiently and the large states in the northern part of the country operate more efficiently.

23 Note, however, that the picture changes when we look at unit-specific variation over time and not the average, as shown subsequently.

24 The simulated maximum likelihood estimates as well as the inefficiency predictions were obtained using LIMDEP Version 9.0 (Greene 2007).

25 We assumed an exponential distribution for the inefficiencies.

5) Conclusion

In this paper, we have analyzed the nationwide key characteristics of production in the manufacturing sector in Mexico. Using municipality-level data from the national statistical office INEGI, we test whether there are still marked differences in economic structure and productivity between the 32 states and sub-national regions. We mainly focus on the differences between the northern and southern regions and states in Mexico. We use a balanced panel data set for the years 1989, 1999 and 2004 and apply traditional panel data models such as fixed and random effects.
Additionally, we use two panel data model extensions: first, the nested unbalanced error component model to capture state and municipality heterogeneity, and second, different stochastic frontier models to identify efficiency differentials at municipality level. We hereby use recently developed models by Greene (2005) to distinguish unit-specific unobserved heterogeneity from technical inefficiency in order to capture the different operating conditions of the regions.

26 In all states we find a heterogeneous picture, but within this framework we do not want to go into detail in discussing and interpreting the municipality-level changes over the years.

First of all, our results indicate that the underlying technology in the manufacturing sector is best described by a Translog production function implying increasing returns to scale at municipality level. Further analysis revealed that unobserved heterogeneity at regional level plays a role in our analysis and can be well accounted for with classic panel approaches such as fixed effects. The nested panel data model explicitly accounts for heterogeneity at both regional levels: the state and the municipality. The model offers new and unique insights into the relation between the state and municipality level in Mexico. Predictions of the region-specific intercepts from the nested error component model show that sustained differences between the states prevail and primarily run along the North-South divide. The signs of the predicted municipality and state effects typically move in the same direction, but the state effects are clearly higher, indicating a prominent role for factors at state level in explaining lagging regions. Predicted municipality effects from the FE specification have confirmed the North-South divide and imply a similar ranking of the states, giving further support for the robustness of our results. 27 The different stochastic frontier models first of all confirm that the manufacturing sector is characterized by increasing returns to scale. All model specifications reflect the same trend: the northern states operate more efficiently in the manufacturing sector than the South of the country. Thus, there are still sustained differences in the economic structure, and southern municipalities still appear to suffer from a lack of technical efficiency in comparison to the North.
27 It does confirm the importance of regional effects but, unlike the nested error component model, is not able to discriminate between the state and municipality level.

We have tried to explain this phenomenon with the geographical closeness to the US border, where the northern regions benefit from the connectivity to trans-border markets in the United States, from spillovers such as knowledge or technology transfer, and from competitive pressures fostering efficiency. Further, the South is dominated by micro firms with low-skilled employees; one of the major concerns here is the low labor productivity in the South, which is considerably lower than national standards, and not only in the manufacturing sector. It is crucial to implement various types of skill upgrading and technology adoption programs in order to enhance productivity. Another explanation of the lower productivity level might be the low levels of infrastructure and transport connectivity, in terms of accessibility and reliability, in the southern states in contrast to the northern regions. This implies the necessities of inter and