Estimation of Efficiency with the Stochastic Frontier Cost. Function and Heteroscedasticity: A Monte Carlo Study

Estimation of Efficiency ith the Stochastic Frontier Cost Function and Heteroscedasticity: A Monte Carlo Study By Taeyoon Kim Graduate Student Oklahoma State Uniersity Department of Agricultural Economics Dr. Wade Brorsen Regents Professor Oklahoma State Uniersity Department of Agricultural Economics Dr. Philip Kenkel Professor Oklahoma State Uniersity Department of Agricultural Economics Selected Paper prepared for presentation at the American Agricultural Economics Association Annual Meeting, Orlando, Fl, July 7-9, 008. Copyright 008 by Taeyoon Kim, Dr. Wade Brorsen, and Dr. Philip Kenkel. All rights resered. Readers may make erbatim copies of this document for non-commercial purposes by any means, proided that this copyright notice appears on all such copies.

Estimation of Efficiency ith the Stochastic Frontier Cost Function and Heteroscedasticity: A Monte Carlo Study Abstract The obectie of this article is to address heteroscedasticity in the stochastic frontier cost function using aggregated data and erify it using a Monte Carlo study. We find that hen the translog form of a stochastic frontier cost function ith aggregated data is estimated, all explanatory ariables can inersely affect the ariation of error terms. Our Monte Carlo study shos that heteroscedasticity is only significant in the random effect and the unexplained error term not in the inefficiency error term. Also, it does not cause biases, hich is quite opposite of preious research. These are because our model is approximately defined by first order Taylor series around zero inefficiency area. But, disregarding heteroscedasticity causes the aerage inefficiency to be oerestimated hen the ariation of inefficiency term dominates the other error terms.

Estimation of Efficiency ith the Stochastic Frontier Cost Function and Heteroscedasticity: A Monte Carlo Study Introduction Since the adent of Farrell (957) efficiency indexes using a deterministic frontier function, efficiency measurements hae been consistently deeloped by researchers oer all industry. Aigner, Loell, and Schmidt (977) brought about the possibility that deiations from the frontier may arise because of random factors and proided the disturbance term as the sum of symmetric normal and half-normal random ariables. Meeusen and an den Broeck (977) also introduced the composed error hich distinguishes inefficiency from a statistical disturbance of randomness. Jondro et al. (98) suggested estimating inefficiency component for each obseration ith the stochastic frontier model so that rankings among obserations ere possible. In terms of addressing heteroscedasticity, Caudill and Ford (993) found the biases in the frontier estimation due to heteroscedasticity of a one-sided error and later Caudill, Ford, and Gropper (995) found that the rankings of firms by efficiency measures ere significantly affected by the correction for heteroscedasticity. These ere folloed by the suggestion from Schmidt (986) that a one-sided error can be associated ith factors

under the control of the firm hile the random component can be associated ith factors outside the control of the firm. Since firm-leel data are used in the frontier function and firms ary idely in size, size-related heteroscedasticity is inoled in a one-sided error. On the contrary, concerned ith heteroscedasticity only in a one-sided error, Hadri (999) suggested heteroscedasticity of both error terms ith the same data of Caudill, Ford, and Gropper (995). On the other hand, Dickens (990) shoed that aggregated data caused heteroscedasticity ith the size of group. This result can lead that small firms are less efficient hile large firms are more efficient hen the frontier (aerage) cost function ith aggregated data is estimated because the ariation is decreasing as the size of group increases. This can also induce the discussion that economies of size is also affected by heteroscedasticity. A translog cost function and maximum likelihood estimation (MLE) are idely used in preious research. Since the oint distribution of a symmetric normal and a truncated normal had been first deried by Weinstein (964), Aigner et al. (977) and Greene (980) described a method of estimating the frontier production model using a translog functional form and MLE.

This paper is about addressing heteroscedasticity using the translog form of a stochastic frontier cost function ith aggregated data and examining aerage efficiency measurement for both cases. We specify the aggregated model from the disaggregated model and take a natural log in order to use a translog cost function. To make the equation be simplified, first order Taylor series around the frontier area is applied. Then, e can see heteroscedasticity on error terms. A Monte Carlo study enables to erify it and compare efficiency measurement in the presence of heteroskedasticity ith that of disregarding heteroskedasticity. Theory Consider the folloing disaggregated model: () C = ( Xβ ) + u +, i=,..., n,,..., J, i i i = u ~ ii d N(0, ), ~ ii d N(0, ), co( u, ) = 0, u i i here C i is the cost of the ith output in the th firm, X is a ector of explanatory ariables including input prices and output, β is a ector of unknon parameters to be estimated, u is the random effect of the th firm, i is the unexplained portion of the cost of ith output in the th firm.

In a stochastic frontier cost function, the inefficiency is considered as the deiations from the frontier so that a one-sided error term is needed to represent that. Thus a stochastic frontier cost function can be defined as () C = Xβ ) + u + +, ~ ii d N(0, ), co( u, ) = 0, i ( i i co(, ) = 0, i here is the inefficiency and a one-sided error ith E ) = π and ( ar( ) = ( π ). Especially, π is knon as aerage inefficiency measurement by Aigner et al. (977). When being added oer all outputs ithin each firm, a (total) stochastic frontier cost function can be deried as n n (3) Ci = (Xβ ) i + n u + i+ n. i= i= i= n here n is the number of output in the th firm. Assuming cost functions for each firm hae same explanatory ariables, a (total) stochastic frontier cost function can be also expressed as (4) TC = n ( Xβ ). + n ( u +. + ), =,..., J,. ~ N(0, ), n here TC is the total cost for the th firm, and the dot is the common notation to denote that the ariable has been aeraged oer the corresponding index; outputs in this

case. Here, e can see that aggregated data cause heteroscedasticity ith outputs on error terms hen the (total) stochastic frontier cost function is estimated. A translog cost function can be usually used due to seeral coneniences such as including multiple outputs, calculating elasticities easily, adusting heteroscedasticity and etc. Let s take a natural log for the equation (4) and first order Talyor series of ( u + + ) ln Xβ around the mean of random errors and the frontier of ). +. inefficiency error such as u = 0, 0 and = 0 gies us folloing model:. = (5) lntc ln( ) + ln n + ( u + + ) Xβ... ( Xβ). The ariance of error terms is ( Xβ ). + u n +, hich means that explanatory ariables also affect heteroscedasticity on error terms. In other ords, the ariance of all error terms is inersely affected by all squares of explanatory terms. When letting e = +, the density function by Weinstein(964) is knon as * e * λ e (6) f ( e) = f F <, e< + here + λ=, =, * f and * F are the standard probability density function and the standard cumulatie density function, respectiely. Here, λ is an indicator of the relatie ariability of error terms. As Aigner et al. (977) mentioned it, λ 0 means 0 and/or, i.e. that inefficiency error is dominated by random error.

The log-likelihood function ith heteroscedasticity in equation (5) can be expressed as J J (7) ln( f ( e )) = ln = = f * e F * λ e here e ( u + + ) =. ( Xβ)., = u + +, ( ) Xβ. n λ =. + n u This enables to use maximum likelihood estimation ith heteroscedasticity for the stochastic frontier cost function. Data and procedures A Monte Carlo study can be used to examine heteroscedasticity of a stochastic frontier cost function on error terms. Based on equation (), our true model is assumed as (8) C i = ri + u + i +. here r i is the input price of the ith output in the th firm, the others are the same as preiously defined. Aggregation oer all outputs ill derie folloing model: n (9) Ci = TC = n r. + n ( u +. + ). i Taking a natural log and first order Taylor series around the mean of random errors and the frontier of inefficiency error result in the folloing model:

(0) ln TC ln( r. ) + ln n + ( u +. + ). r So, our stochastic frontier cost function can be defined as () ln TC = β 0+ β ln r. + β ln n + ( u +. + ), r here heteroscedasticity is incorporated into the ariances by assuming.. u ( ) + = δ. u u δ and = δ n ( r. ) ( r. ) +, true alues of the parameters are expected to be β = 0 0, β = β = δ u = δ = δ =. The input price is generated as r i ~ N(,4). The means of output is around 8.89 and the ariance of that is around. e assumed that there exist lots of small firms and a fe of large firms. In order to see the changes in a relatie ariability of error terms, e hae to indicators ( λ andλ ), i.e. first one has the same ariability and the last has more ariability in the inefficiency error, so that e can see ho much the aerage inefficiency changes as the ariability of inefficiency increases. Using NLMIXED in SAS ith 00 samples of 00 obserations, the stochastic frontier cost function ith heteroscedasticity and ithout heteroscedasticity is estimated. Outcomes are first compared ith expected alues to see ho much the model is different from the true model and then compared each other ith and ithout heteroscedasticity. The SAS code for output is int(5*exp(rannor(345)))+.

Results Table and Table sho the estimation results for the stochastic frontier cost function ith 00 samples of 00 obserations. We estimated β ' s for the frontier function and δ ' s for ariance equation, and ariances and aerage inefficiencies. Second column has expected alues for each parameter. Third column is the results ith restricting heteroscedasticity and fourth column is the results ithout heterescedasticity. Oerall, the estimated parameters are close to the expected alues except the intercept and theδ. For the intercept, it is interpreted as remainders by first order Taylor series. For theδ, maximum alue as imposed to be 0. because of the conergence. Later, it affects the ariance of the inefficiency error to be smaller than hat e expected. First, looking at the parameters from the ariance equation indicates that there exists heteroscedasticity in the translog form of the stochastic frontier cost function ith aggregated data as expected. Second, comparing the results ith heteroscedasticity and ithout heteroscedasticity informs that there are almost no biases in the stochastic frontier cost function, hich is quite opposite of preious research. This is because of first order Taylor series around zero inefficiency area so that the inefficiency is assumed to be almost zero. It is like heteroscedasticity in ordinary least square.

Third, let s focus on the aerage inefficiency( / π ) in table and table. In table, both of the aerage inefficiency are almost same hile in table, the aerage inefficiency ithout heteroscedasticity is times bigger than that ith heteroscedasticity. In other ords, as the ariability of the inefficiency error increases and dominates the random errors, the aerage inefficiency is oerestimated hen disregarding heteroscedasticity. Conclusions In the frontier estimation, the translog form ith aggregated data is mostly used and outputs are usually included in the ariance equation to see hether heteroscedasticity exists or not. Theoretically, the ariation of the (total) stochastic frontier cost function ith aggregated data increases as the output increases. When the translog form of the stochastic frontier cost function is estimated, all explanatory ariables can inersely affect on the error terms. Our Monte Carlo study shos that heteroscedasticity is significant in the random effect and the unexplained error term hile it is insignificant in the inefficiency error term. Also, it does not cause biases, hich is quite opposite of preious research. This is because our model is approximately defined by first order Taylor series around zero inefficiency area. Most importantly, disregarding

heteroscedasticity causes the aerage inefficiency to be oerestimated hen the ariance of inefficiency term dominates the other error terms. Using first order Taylor series around the mean of inefficiency error might be more close to the preious research. Then, the relationship beteen the inefficiency measurement for each obseration and output leel ith heteroscedasticity and ithout heteroscedasticity ill gie us more interesting findings. These ill be for the future research.

Table. Estimation Results ith 00 Samples of 00 Obserations ( λ ) Parameters Expected Value β 0 0 β β δ u δ δ u+ 0.0 0.0 MLE / Heteroscedasticity 0.4984 (0.04464) 0.8740*** (0.0783).00068*** (0.005).7660*** (0.05038) 0.83868* (0.5499) 0.05 a (0.0046) 0.0860 (0.00036) 0.000 (0.00003) MLE /o Heteroscedasticity 0.4584 (0.04804) 0.87530*** (0.097).0068*** (0.0060) 0.0848 (0.00035) 0.0000 (0.00004) /π 0.079 0.079 0.039 Note: Standard deiations are reported in parentheses. Asterisk(*), double asterisk(**), and triple asterisk(***) denote significance on aerage at 0%, 5%, and %, respectiely. a imposed beteen 0 and 0. because of conergence.

Table. Estimation Results ith 00 Samples of 00 Obserations ( λ ) Parameters Expected Value β 0 0 β β δ u δ δ u+ 0.0 0.04 MLE / Heteroscedasticity 0.7587 (0.0563) 0.77906*** (0.047).0000*** (0.0098) 3.787*** (0.08685) 0.75843* (0.33) 0.0076 a (0.00483) 0.0333 (0.0005) 0.00030 (0.00003) MLE /o Heteroscedasticity 0.7706 (0.0603) 0.77073*** (0.0403).00070*** (0.0005) 0.0395 (0.0005) 0.00064 (0.00005) /π 0.59 0.0380 0.005 Note: Standard deiations are reported in parentheses. Asterisk(*), double asterisk(**), and triple asterisk(***) denote significance on aerage at 0%, 5%, and %, respectiely. a imposed beteen 0 and 0. because of conergence.

References Aigner, D., C.A.K. Loell, and P. Schmidt. 977. Formulation and Estimation of Stochastic Frontier Production Models. Journal of Econometrics 6: -37. Caudill, S.B., and J.M. Ford. 993. Biased in frontier estimation due to heteroscedasticity. Economics Letters 4: 7-0. Caudill, S.B., J.M. Ford, and D.M. Gropper. 995. Frontier Estimation and Firm- Specific Inefficiency Measures in the Presence of Heteroscedasticity. Jounal of Business & Economic Statistics 3(): 05-. Dickens, W.T. 990. Error Components in Grouped Data: Is It Eer Worth Weighting. The Reie of Economics and Statistics 7(): 38-333. Farrell, M.J. 957. The Measurement of Productie Efficiency. Journal of the Royal Statistical Society. Series A(General) 0: 53-90. Greene, W.H. 980. On the Estimation of a Flexible Frontier Production Model. Journal of Econometrics 3: 0-5. Hadri, K. 999. Estimation of a Doubly Heteroscedastic Stochastic Frontier Cost Function. Journal of Business & Economic Statistics 7(3): 359-363 Jondro, J., C.A.K. Loell, I.S. Matero, and P. Schmidt. 98. On the Estimation of Technical Inefficiency in the Stochastic Frontier Production Function Model. Journal of Econometrics 9: 33-38.

Meeusen, W., and J. an den Broeck. 977. Efficiency Estimation from Cobb-Douglas Production Functions ith Composed Error. International Economic Reie 8: 435-444. Schmidt, P. 986. Frontier Production Functions. Econometric Reies 4: 89-38. Weinstein, M.A. 964. Query : The Sum of Values from a Normal and a Truncated Normal Distribution. Technometrics 6(): 04-05.