The objective of this study was to identify groups of technologically homogeneous milk producers in the State of Goiás and to calculate the potential gains for inefficient producers, taking as reference the best-performing producers (benchmarks) at the same technological level, i.e., in the short term, and the benchmarks at a higher technological level, i.e., in the long term. Multivariate statistical techniques were used to form homogeneous groups of producers, and a multi-stage data envelopment analysis (DEA) model was used to estimate efficiency scores and identify benchmarks. Four groups of producers were formed, and the results indicate that, on average, dairy farming in Goiás has mainly the characteristics of traditional, less specialized production. The estimated average efficiency for producers as a whole, i.e., the long-term efficiency, was 0.571. The short-term results varied according to the group analyzed but were higher than the long-term values, indicating that the factor analysis promoted technological homogeneity among the producers of the same group. The results showed a trend in the dairy farming of Goiás towards adjusting the input/output ratio to the standards of less capital-intensive production systems.

Key Words: benchmark, cluster, dairy farming, efficiency

Introduction

Historically, the state of Goiás has been among the top five milk-producing states in Brazil. In 2010, it ranked fourth, producing 3.193 billion liters, which corresponded to about 10.4% of total production and 10% (R$ 2.11 billion) of the Gross Value of Brazilian National Milk Production (IBGE, 2012). While milk production in Goiás grew 5.13% per year from 1990 to 2010, national growth was 3.66% per year (IBGE, 2012). Even with such growth, attention must be paid to efficiency in dairy farming, in order to make it more attractive than other productive activities or financial market investments.

The new competitive environment resulting from the transformations in Brazilian dairy farming, highlighted by authors such as Gomes & Ferreira Filho (2007), Gonçalves et al. (2008) and Siqueira et al. (2010), boosted the development of the sector. Although Brazilian dairy farming still has low average productivity, the activity has gradually specialized and, notably, the largest producers have obtained gains in productivity and efficiency (Alvim et al., 2009).

In the search for increased efficiency and productivity, the best-performing farmers become a reference, or benchmark, for other producers. However, before a "reference farmer" is used to propose improvements to another producer, it is necessary to consider the technological characteristics of each of them and the restrictions on changes in technology; for this reason, it is important to consider the time horizon.

The studies by Tupy & Yamaguchi (2002) and Gonçalves et al. (2008), who analyzed the technical efficiency of dairy farming, can be interpreted as long-term results, considering that all inputs can be modified. However, this reasoning cannot be applied in the short term, since the level of technology should then be considered constant. This study breaks new ground by adapting the approach of Barua et al. (2004) and Brockett et al. (2005) to measure short- and long-term efficiency, considering the production possibility frontier of each technological system.

Thus, the objective of this study was to identify groups of technologically homogeneous producers and calculate the gains for inefficient producers when corrections in inefficiency are made, considering the benchmarks of the same technological level, i.e., in the short term, and all technological stages, i.e., in the long run.

Material and Methods

In the literature on productivity, efficiency is understood as the comparison between observed and optimal values of inputs, outputs, revenues, profits, and/or costs. Considering the optimal product, Koopmans (1951) defined the optimal production as technically efficient, if it is not possible for the company to increase the amount produced of any good, keeping the production of the others constant, or if the company is not able to reduce the use of any input, keeping the production constant.

Considering the production of milk by a producer who uses a given vector of inputs $x = (x_1, \ldots, x_N) \in \mathbb{R}^N_+$, which results in a vector of products $y = (y_1, \ldots, y_M) \in \mathbb{R}^M_+$, making use of the technology T, we have:
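
A standard formalization consistent with these definitions is given here as an assumption, not as the authors' exact expression. It combines the technology set with the product-oriented measure of technical efficiency, in which $\phi$ (a symbol introduced here for illustration) is the largest proportional expansion of all products that remains feasible with the observed inputs:

```latex
T = \{(x, y) : x \text{ can produce } y\}, \qquad
TE(x, y) = \left[\max\{\phi : (x, \phi y) \in T\}\right]^{-1}
```

Under this formulation a producer is technically efficient when $TE(x, y) = 1$, and values below 1 indicate that production could be expanded without additional inputs.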

Thus, in a rural property, the inputs are the means of production, such as land, labor and others; the processing is done by the technology employed; and the outputs are the products of the productive effort and technology. Efficiency relates, therefore, to the operating conditions of the system, i.e., the best use of the inputs to optimize the output, given the technology available (Ferreira et al., 2009). Productive efficiency will thus depend on factors internal to the company, such as administrative skills, access to information and competitiveness, since in a competitive capitalist environment prices, legislation and other external factors are taken as given (Piot-Lepetit et al., 1997; Ferreira et al., 2009).

In this context, equation (1) assumes that all companies share the same technology when efficiency is estimated, which may not hold in practice. Moreover, the producer is considered able to correct all sources of inefficiency, which presupposes that the technology and the factors of production that are fixed in the short term can be changed. In fact, in the short term the producer can make only minor adjustments to the productive system and therefore keeps the technology in use. In the long run, on the other hand, all resources are subject to change, so producers who invest in the activity can change their technological level.

In order to determine and design the homogeneous groups of milk producers in relation to the technological level adopted, factor analysis and cluster analysis are used. After that, the multi-stage model of data envelopment analysis (DEA) is utilized in order to estimate the technical efficiency of producers in each group (short term) and as a whole (long term).

Factor analysis consists of the description of the original variability of random vector X, in terms of a smaller number (m) of random variables that summarize the information of the original variables (Mingoti, 2007). By reducing the information contained in a set of variables, factor analysis facilitates the interpretation of results, the formation of groups of producers with similar technological systems and the determination of corrections in production systems in the short term.

The factor analysis model can thus be written as (Härdle & Simar, 2003):

$$x_j = \sum_{l=1}^{k} q_{jl} f_l + u_j, \quad j = 1, \ldots, p$$

in which $f_l$, for l = 1, ..., k, denotes the factors or latent variables, the number of factors k being lower than the number of variables p; $x_j$, the random variables of the vector X; $q_{jl}$, the factor loading of the j-th variable on the l-th factor; and $u_j$, the error on the j-th variable.

The interpretation of the factors is based on the factor loadings of the variables used, which express the correlation between each variable and the factor. To make the interpretation easier, the factor scores were rescaled to the range between 0 and 1, giving the standardized scores $F^{+}_{ji}$, via the procedure:

$$F^{+}_{ji} = \frac{F_{ji} - F_{j}^{\min}}{F_{j}^{\max} - F_{j}^{\min}}$$

in which $F_{ji}$ is the original score of the i-th producer on the j-th factor, $F_{j}^{\min}$ is the lowest score observed for the j-th factor and $F_{j}^{\max}$ is the highest score observed for the j-th factor.
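
The rescaling of factor scores to the range between 0 and 1 can be illustrated with a minimal sketch. It assumes the usual min-max form, in which the lowest observed score is subtracted from each score and the result is divided by the range of scores. The function name and the sample scores are hypothetical and are not data from the study.

```python
def rescale_scores(scores):
    """Rescale the scores of one factor to the range [0, 1] (min-max form)."""
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        raise ValueError("all scores are equal; rescaling is undefined")
    span = highest - lowest
    return [(s - lowest) / span for s in scores]


# Hypothetical factor scores for five producers (illustration only).
raw = [-1.2, -0.3, 0.0, 0.9, 2.4]
scaled = rescale_scores(raw)
print(scaled)
```

The producer with the lowest raw score receives 0 and the one with the highest receives 1, so the rescaled values of different factors can be compared directly.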

The adequacy of the factor analysis model can be assessed by two tests: Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) criterion. The first tests the null hypothesis that the correlation matrix is an identity matrix; rejecting it is required because factor analysis assumes that the variables are correlated. The significance level adopted was α = 0.10. The KMO criterion assesses the adequacy of the data to the methodology by comparing the simple and partial correlations between variables, and takes values between 0 and 1. According to Hair et al. (1995), KMO values higher than 0.7 indicate good adequacy of the sample to factor analysis.
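
Bartlett's test of sphericity can be sketched from a correlation matrix alone. The example below assumes the textbook form of the statistic, $-[(n - 1) - (2p + 5)/6] \ln|R|$ with $p(p - 1)/2$ degrees of freedom, where n is the number of observations and p the number of variables. The function names and the 2 × 2 matrix are hypothetical; the study itself ran the test in Stata.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    p = len(corr)
    statistic = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    dof = p * (p - 1) // 2
    return statistic, dof


# Hypothetical correlation matrix of two variables observed on 100 units.
corr = [[1.0, 0.5], [0.5, 1.0]]
statistic, dof = bartlett_sphericity(corr, n_obs=100)
print(round(statistic, 3), dof)
```

The statistic is compared with the chi-square distribution with the returned degrees of freedom; a large value leads to rejection of the hypothesis that the variables are uncorrelated. A singular correlation matrix has a zero determinant, in which case the logarithm is undefined and the sketch raises an error.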

Cluster analysis is used to form homogeneous groups of producers regarding the technological system. The two-stage method proposed by Punj & Stewart (1983) was used to obtain the groups. This method first determines the number of clusters by Ward's minimum variance method, using the Je(2)/Je(1) stopping criterion proposed by Duda & Hart (1973). Subsequently, the k-means method is used to distribute producers among the groups determined previously (Toyoshima et al., 2005).

Ward's method is based on two principles, according to Mingoti (2007): (a) initially, each element is considered a single cluster, and (b) in each step of the clustering algorithm, the two clusters separated by the smallest distance are combined. Once united, two clusters can no longer be separated. The distance between two clusters ($C_l$ and $C_i$) is defined by:

$$d(C_l, C_i) = \frac{n_l n_i}{n_l + n_i} (\bar{X}_l - \bar{X}_i)' (\bar{X}_l - \bar{X}_i)$$

in which: $n_i$ = number of elements in the cluster $C_i$; $n_l$ = number of elements in the cluster $C_l$; and $\bar{X}_i$ and $\bar{X}_l$ = centroids (vectors of sample means) of clusters $C_i$ and $C_l$, respectively.
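
The between-cluster distance used in minimum variance clustering can be illustrated with a short sketch. It assumes the usual form of the criterion, in which the squared Euclidean distance between the two centroids is weighted by the product of the cluster sizes divided by their sum. The function names and the sample points, given as pairs of rescaled factor scores, are hypothetical.

```python
def centroid(cluster):
    """Vector of sample means of the points in a cluster."""
    n = len(cluster)
    return [sum(point[d] for point in cluster) / n for d in range(len(cluster[0]))]


def ward_distance(cluster_l, cluster_i):
    """Size-weighted squared Euclidean distance between two cluster centroids."""
    n_l, n_i = len(cluster_l), len(cluster_i)
    mean_l, mean_i = centroid(cluster_l), centroid(cluster_i)
    squared_gap = sum((a - b) ** 2 for a, b in zip(mean_l, mean_i))
    return n_l * n_i / (n_l + n_i) * squared_gap


# Hypothetical clusters of producers described by two rescaled factor scores.
cluster_a = [(0.1, 0.8), (0.3, 0.6)]
cluster_b = [(0.9, 0.2)]
distance = ward_distance(cluster_a, cluster_b)
print(round(distance, 4))
```

At each step of the agglomerative procedure, the pair of clusters with the smallest value of this distance would be merged.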

The k-means method consists in allocating each property to the group whose center is closest to the vector of values observed for that property. The method comprises four steps. First, k centroids are chosen to start the partition. Second, each element is compared with each centroid in terms of distance and is allocated to the group whose centroid is nearest; this procedure is applied to each of the n elements. Third, the centroid values are recalculated for every new group. Finally, the second and third steps are repeated until no reallocation is possible (Mingoti, 2007).
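
The four steps can be sketched in a few lines. This is a minimal illustration that assumes squared Euclidean distance and initial centroids supplied by the caller; the function names and the sample points are hypothetical, and the study itself used the routine available in Stata.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, centroids, max_iter=100):
    """Allocate points to the nearest centroid until no reallocation occurs."""
    centroids = [list(c) for c in centroids]  # step 1: initial centroids
    labels = None
    for _ in range(max_iter):
        # Step 2: allocate each element to the group with the nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda g: squared_distance(p, centroids[g]))
            for p in points
        ]
        if new_labels == labels:  # step 4: stop when nothing is reallocated
            break
        labels = new_labels
        # Step 3: recalculate the centroid of every group.
        for g in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == g]
            if members:  # an empty group keeps its previous centroid
                centroids[g] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical producers described by two rescaled factor scores.
points = [(0.1, 0.9), (0.2, 0.8), (0.8, 0.2), (0.9, 0.1)]
labels, centers = k_means(points, centroids=[points[0], points[2]])
print(labels, centers)
```

In the two-stage procedure, the number of groups and the initial centroids would come from the hierarchical stage rather than from arbitrary starting points.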

With the information regarding the number of groups and technical characteristics of the producers, the next step is to calculate the production efficiency, considering the technological specificities and limitations to changes in the productive system.

Initially, data envelopment analysis is applied to all groups together, under the assumption of constant returns and product orientation, to allow comparisons of efficiency scores between groups and to obtain the long-term efficiency. This procedure is necessary because, when decision making units (DMUs) of different groups are compared in the same, jointly estimated model, the average efficiency has an absolute meaning; when technical efficiency measures are estimated separately for each group, however, their comparison between groups is interpreted in terms of homogeneity, that is, groups with higher scores are more homogeneous (Gomes et al., 2009).

Subsequently, given the characteristics of each group and the operating limitations of the model, the efficiency scores are estimated separately, only for specific groups. The intention is to propose changes in the productive system of inefficient producers based on similar technology producers and those with better performance, i.e., to determine the corrections in the production system in the short term for the producers of the same group.
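
The difference between the pooled (long-term) and the group-specific (short-term) frontier can be illustrated with a deliberately simplified sketch. It assumes a single input and a single product under constant returns, in which case efficiency reduces to the ratio between a producer's productivity and the highest productivity in the reference set. The study itself uses five inputs and two products and solves linear programs. The function name, the group labels and the data are hypothetical.

```python
def crs_efficiency(units, reference):
    """Efficiency of each (input, product) pair relative to a reference set,
    for one input and one product under constant returns to scale."""
    best_ratio = max(y / x for x, y in reference)
    return [(y / x) / best_ratio for x, y in units]


# Hypothetical producers, given as (input, product), in two technological groups.
groups = {"A": [(10, 40), (20, 60)], "B": [(10, 20), (30, 45)]}
pooled = [unit for members in groups.values() for unit in members]

# Long term: every producer is compared with the frontier of the whole sample.
long_term = {g: crs_efficiency(members, pooled) for g, members in groups.items()}
# Short term: each producer is compared only with producers of its own group.
short_term = {g: crs_efficiency(members, members) for g, members in groups.items()}

print(long_term)
print(short_term)
```

Because the group-specific reference set is a subset of the pooled one, a short-term score can never be lower than the corresponding long-term score, which is the pattern expected when the groups are technologically homogeneous.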

For the estimation of efficiency scores and the identification of benchmarks, the multi-stage DEA method proposed by Coelli (1998) was used. This method allows the projected point on the efficient production frontier to be identified, taking any slacks into account. The method comprises six steps, described below.

K inputs and M products are assumed for each of the N firms, in which xi and yi are the input and product vectors of the i-th firm. The inputs matrix is of K × N order, and the products matrix, of M × N order. For constant returns and product orientation, the first step is to solve the following linear programming model:
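
As an assumption, and not necessarily the authors' exact expression, the standard product-oriented model under constant returns can be written as follows. Here X and Y are the input and product matrices arranged with one column per firm, $\lambda$ is an N × 1 vector of weights and $\phi$ is the proportional expansion of the products of the i-th firm:

```latex
\begin{aligned}
\max_{\phi,\,\lambda}\quad & \phi \\
\text{subject to}\quad & -\phi\, y_i + Y\lambda \ge 0, \\
& x_i - X\lambda \ge 0, \\
& \lambda \ge 0
\end{aligned}
```

In this formulation $\phi \ge 1$, and the technical efficiency score of the i-th firm is $1/\phi$, which lies between 0 and 1.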

The third step identifies the slacks in the use of inputs by means of K sequences of linear models. The linear programming model for the i-th firm is specified:

in which: cxij = j-th input of the i-th firm, which can be reduced by multiplying it by θ obtained in step 1; Xej = vector of inputs for all firms; = vector of inputs for firms excluding the j-th input; = matrix of inputs of all efficient firms (excluding the j-th input); Ye = product matrix of efficient firms; λ = matrix of dimension Ne × 1, in which Ne is the number of efficient firms.

The fourth step consists of operationalizing the linear programming model in order to reduce all potential inputs identified with slacks, expressed as:

in which the superscript "s" identifies the subset of inputs with the potential slack and "ns" refers to the remaining inputs.

Step 5 takes the point identified in step 4 and repeats steps 3 and 4 until all input slacks have been eliminated. The final step (6) takes the point projected by step 5 (for the i-th firm) and repeats steps 3-5 through radial expansion in the slacks of the product, until they are removed.

The Wilcoxon (1945) test was used for the purpose of comparing and analyzing the mean values for the efficiency scores between groups of producers.

The data used in the study were provided by Federação de Agricultura e Pecuária de Goiás - FAEG (FAEG, 2009) and are part of the Diagnosis of Milk Production Chain of Goiás, published in 2009. The variables used in the factor analysis were: INST: value of the facilities, in Brazilian Real (R$); MAC: value of machinery, in R$; PAST: area with natural and man-made pastures, in hectares; CONC: expenditures on food concentrates, R$/year; FLB: family labor used, R$/year; HLB: hired labor, R$/year; ASTEC: technical assistance (binary variable that assumes value 1 if the producer receives technical assistance and 0 otherwise); EXPER: experience of the producer, years of activity; IMILK: income from milk, R$/year; VARIA: variation in production between rainy and dry seasons, in %.

Variables used in data envelopment analysis: a) Products: Rmilk: revenue from the milk produced on the property (sold and consumed by the family), in Brazilian Real (R$); Ranimal: total revenue from the sale of animals, including animals consumed by the family, in R$; b) Inputs: Area: area for dairy farming, comprising areas of natural and formed grassland, production of grains for silage and the area intended for production of grass, in hectares; Mach: capital obtained by summing the value of tractors, forage choppers, sprayers, tanks for milk, bottled semen, wagon, milking machine, plow, irrigation equipment, scales for weighing animals, vehicle used for cattle and others, weighted by the percentage of use dedicated to dairy farming, in R$; Instal: improvements obtained by summing the value of the stable, barn, milking parlor, trunk, silo, individual and collective calf barns, feed storage room, machinery, fences, power, dam, facilities for expansion tanks and accommodations of employees, weighted by the percentage of use in the activity, in R$; Labor: labor intended for dairy farming activities, weighted by the percentage of use in dairy farming, including family and hired workforce, in man-equivalents/year; Ncows: number of milking and dry cows on the farm, in heads.

Data collection was conducted from July 2008 to June 2009 and comprised 500 commercial producers. The factor and cluster analysis models were estimated using Stata (Data Analysis and Statistical Software, version 2.10), while the multi-stage DEA models were calculated using the software DEAP (Data Envelopment Analysis (Computer) Program, version 2.1).

Results and Discussion

The factor analysis model fitted the data adequately, according to Bartlett's test of sphericity (P<0.10) and the Kaiser-Meyer-Olkin (KMO) criterion, whose value (0.8474) was greater than 0.70. The variables (Table 1) were reduced to two factors that together explain 51.57% of the total variance of the variables.

Based on the values from factor loads (Table 1), one can verify that milk production has more weight in the estimation of factor 1, while the costs are more significant for factor 2. However, only factor loads greater than 0.50 were used for the interpretation of factors.

Factor 1 was called "specialized production", since it is more correlated (positively) with the variables related to the value of plant, machinery, pasture area, the expense of hired work force and revenues from milk production, assuming that more capital-intensive properties, with higher expenses and revenues, are those with the highest level of expertise.

On the other hand, factor 2 was defined as "traditional production" because of the importance of the variables related to experience and to the variation in production between the rainy and the dry seasons. Usually, undercapitalized producers who have been in the activity longer tend to have lower investment in plant and machinery, spend less on concentrates, make greater use of family labor, have little access to technical assistance, obtain lower returns from dairy farming and see their milk production vary significantly between the rainy and dry seasons.

After standardization, the cluster analysis based on factors F11 and F22 resulted in four groups of milk-producing properties in Goiás (Table 2), in increasing order of specialization (F11).

It is possible to verify that groups G3 and G4 have the most similar characteristics, although F22 has great importance within group G3, which makes it the least specialized group. Group G1, comprising 40 properties, shows intermediate behavior, since the factor related to the specialization of dairy farming gains importance while the second factor loses importance, compared with the previous two groups. Group G2, although small (seven properties), has the most marked characteristics of specialization and capitalization. It should be noted that the traditional production factor has a relatively high weight in all groups, implying that, on average, dairy farming in Goiás has mainly the characteristics of traditional production.

The mean values for each of the four groups (Table 3) follow the order established in the description of the groups (Table 2), i.e., group G2 shows the highest values and group G4, the lowest.

Once the groups were identified, the technical efficiency was calculated, considering the technological specificities and the limitations to changes in the productive system. While technological change is not permitted in the short term, in the long run producers may make all necessary adjustments to technologies and fixed inputs.

The average efficiency estimated for the long term was 0.571, that is, on average, production falls 42.9% short of the level attainable on the frontier with the same quantities of inputs, a gap that can be closed if the inefficiency is corrected.
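
The reading of an efficiency score can be made concrete with a small calculation. The sketch below assumes that the score is reported as the ratio of observed to frontier production, as is usual for product-oriented models; the function names are hypothetical. Under that assumption, a score of 0.571 corresponds to a shortfall of 42.9% relative to frontier production, and a single producer with that score would need to expand production by 1/0.571 - 1, about 75%, to reach the frontier.

```python
def output_shortfall(score):
    """Share by which observed production falls short of frontier production."""
    return 1.0 - score


def potential_expansion(score):
    """Proportional increase in production needed to reach the frontier."""
    return 1.0 / score - 1.0


score = 0.571
shortfall = output_shortfall(score)
expansion = potential_expansion(score)
print(f"shortfall: {shortfall:.1%}, expansion: {expansion:.1%}")
```

Because the expansion is a nonlinear function of the score, the figure computed at the average score need not equal the average of the expansions computed producer by producer.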

The descriptive statistics for the efficiency estimates (Table 4) show that, although group G2 is the most capitalized and has the greatest financial flow (Table 3), it is the only one without efficient properties, while in the other, less capitalized groups about 10% of the properties are technically efficient. This indicates that the most capitalized properties use inputs in excess, when proportionally compared with producers with lower cash flow.

The average short-term efficiency for producers of group G1 was 0.791. The Wilcoxon test showed that this value was higher than the group's average long-term efficiency (P<0.10), which was calculated with all producers in the sample pooled. This shows that the factor analysis did result in greater technological homogeneity among the producers in the group, in line with the microeconomic distinction between the short and the long term. For more details, see Brockett et al. (2005).

There is high possibility of gains in the quantity produced (optimum value) in the short and long terms (Table 5). In the short run, the output can grow substantially through the structuring of the production system, which removes sources of inefficiency: the revenues from the sale of animals and milk can be increased by 28% and 106%, respectively. In the long term, with the incorporation of new technologies and the consequent increase in production, the gains can reach 212% for both products. The percentage gain is the same for the two revenues because data envelopment analysis (DEA) calculates the inefficiency and the corrections radially, which implies equal corrections, provided there are no slacks.

The results demonstrate the importance of considering the time when estimating the performance of production units and especially when changes are proposed in the productive system, designed to increase production or reduce the use of inputs.

In the case of group G3, the estimated average efficiency, considering only the producers of this group, was 0.612. As for group G1, the Wilcoxon test showed that this value was higher than the average efficiency for the members of group G3 (P<0.10) when all producers of the sample are included. Thus, it must be emphasized that the factor analysis promotes greater technological uniformity among the producers of the same group.

Based on the estimates (Table 6), it is worth correcting the sources of inefficiency, since in the short term the producer may raise revenues from the sale of animals and milk by 104% each. Long-term gains may be even greater: 166%. These gains contribute to increasing the income of producers and to improving their quality of life and their perspectives with respect to the activity. Thus, as for group G1 (Table 5), the results make clear the importance of technological innovation for the efficiency (production) of the producers of group G3.

The distribution of benchmark properties per group in the long run (Table 7) can be interpreted as the trend of the production system. Since there are no fixed factors in the long run, producers should seek to increase efficiency by adopting production systems similar to those of the benchmarks. It is observed that 52% of the benchmarks are from group G4, the least capitalized of all, followed by groups G3 and G1, with 33% and 14% of the benchmarks, respectively. Despite being the most capitalized, group G2 does not show any benchmark property, since none were efficient. Moreover, according to the Wilcoxon test applied to the long-term efficiency estimates, only the efficiency scores of groups G3 (P<0.10) and G4 (P<0.10) are higher than those of group G2, confirming the tendency of producers to adjust to features similar to those of producers from G3 or G4.

This result was already expected, since the descriptive statistics for the groups (Table 3) show that the producers of group G2 use proportionally more inputs to produce the same amount than those of the other groups. Besides, according to Gomes (2001), the competitiveness of Brazilian milk production lies in the semi-extensive production system, in which pasture feeds the herd during the rainy period, while during the dry period the herd is fed with silage, hay and similar feeds. This system has the advantage of reducing costs during the rainy season, when the milk price is usually lower, and of concentrating the higher production costs in the dry season, when the milk price is higher.

Conclusions

The dairy farms of Goiás are mostly technically inefficient, even when the particular characteristics of homogeneous groups of properties and the differences between the short and long terms are taken into account. In the short run, the producers considered technologically intermediate have lower potential for gains, although they are more competitive than traditional farmers. The latter, in turn, are more competitive in the long run. The trend of milk production in the State of Goiás points to less capital-intensive production systems.