Browsing Posts in Econometrics

A new Stata command for spatial differencing estimation is now available. sreg fits the following model:

$$y_{ic} = x_{ic}'\beta + \theta_z + \epsilon_{ic}$$

where $y_{ic}$ is the outcome of unit $i$ located in area $c$ with $c = 1, \ldots, C$, $x_{ic}$ is a $k$-vector of exogenous covariates, $\epsilon_{ic}$ is the idiosyncratic error and $\theta_z$ is an unobserved local effect for the unobserved location $z$, $z = 1, \ldots, Z$, possibly at a finer spatial scale than $c$. Estimating this model by ordinary least squares while ignoring $\theta_z$ gives a consistent estimate of $\beta$ only if $E(\theta_z \mid x_{ic}) = 0$. If we set aside this unrealistic assumption and allow for arbitrary correlation between the local unobservables and the explanatory variables, i.e. $E(\theta_z \mid x_{ic}) \neq 0$, a non-experimental approach to estimating the model involves transforming the data in some way to rule out $\theta_z$. An increasingly common way to deal with this issue is the so-called spatial differencing approach.
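As a rough sketch of the idea (the notation is assumed here for illustration and is not taken from the sreg documentation), write the model as $y_{ic} = x_{ic}'\beta + \theta_z + \epsilon_{ic}$. If two units $i$ and $j$ are close enough to share the same local effect $\theta_z$, differencing their equations removes it:

```latex
\begin{aligned}
y_{ic} - y_{jc'} &= (x_{ic} - x_{jc'})'\beta + (\theta_z - \theta_z) + (\epsilon_{ic} - \epsilon_{jc'}) \\
                 &= (x_{ic} - x_{jc'})'\beta + (\epsilon_{ic} - \epsilon_{jc'})
\end{aligned}
```

Least squares on the differenced data then no longer involves $\theta_z$. Because the same unit can enter many pairs, the differenced errors are correlated across pairs, which is why variance estimators designed for pairwise data matter in this setting.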

sreg implements the spatial differencing estimator described in Belotti et al. (2017), together with several variance-covariance estimators, including the dyadic-robust estimator (Cameron and Miller, 2014) and the analytically corrected estimator of Duranton et al. (2011). Of special note, sreg also allows the spatial differencing transformation to be applied to units located in contiguous clusters (a "boundary-discontinuity" design).

A new Stata command for the consistent estimation of fixed-effects stochastic frontier models is now available. sftfe fits the following fixed-effects stochastic frontier model:

$$y_{it} = \alpha_i + x_{it}'\beta + v_{it} \pm u_{it}$$

where $\alpha_i$ is the unit-specific fixed effect, $v_{it}$ is a normally distributed error term and $u_{it}$ is a one-sided strictly non-negative term representing inefficiency. The sign of the $u_{it}$ term is positive or negative depending on whether the frontier describes a cost or production function, respectively. sftfe allows the underlying mean and variance of the inefficiency (as well as the variance of the idiosyncratic error) to be expressed as functions of exogenous covariates. Of special note, sftfe allows the estimation of models in which the inefficiency is assumed to follow a first-order autoregressive process.

Technical details related to the estimators implemented in the command can be found in the article:

Missing data can pose major problems when estimating econometric models, since it is generally unlikely that missing values are Missing Completely At Random. One strategy for addressing this issue without resorting to more complex econometric approaches is multiple imputation, that is, the process of replacing missing values with multiple sets of plausible values. This post provides a simple example in which xsmle is used together with mi, Stata's suite of commands for multiple imputation. Consider a panel dataset in which the only regressor (x1) has 14 missing values.
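One way to inspect the problem before imputing is sketched below. It assumes the example panel is already in memory and has been declared with xtset; only standard Stata commands are used.

```stata
* Assumes the example panel (with regressor x1) is in memory and xtset
count if missing(x1)            // number of missing values in x1
misstable summarize x1          // summary of the missing values in x1
xtdescribe if !missing(x1)      // panel pattern once incomplete rows are dropped
```

The last command shows how the panel would look if the observations with missing x1 were simply discarded.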

Since these missing values make the panel unbalanced, xsmle cannot be run on the data as they are. Nonetheless, we can overcome this obstacle by using mi and xsmle jointly.

The first step is to declare the dataset as mi data, which is done with the command mi set wide. Data must be mi set before any other mi command can be used. It does not matter which mi style you choose, since you can always change it later using mi convert. In this example, I choose the wide style.

Once the dataset has been mi set, the second step is to register the variables with missing values. In this example, x1 has missing values, and mi register imputed x1 declares it as a variable to be imputed. For reproducible results, it is good practice to set the seed of Stata's pseudo-random number generator using the command set seed #, where # is any number between 0 and 2^31-1.

Then, the command mi impute regress x1 = i.cat, add(10) can be used to fill in the missing values of x1 using the set of dummy variables from the categorical variable cat through the regress method (see help mi impute for details on the available methods). The option add(10) specifies the number of imputations to add to the mi data (currently, the total number of imputations cannot exceed 1,000). Once the command has been executed, ten new variables _#_x1 (with # = 1,...,10) are created in the dataset, each representing an imputed version of x1.

Finally, as documented in help mi estimate, the prefix command mi estimate: estimation_command can be used to execute the estimation_command on the imputed _#_x1 variables. This command will adjust coefficients and standard errors for the variability between imputations according to the combination rules by Rubin (1987). In this example, the command

mi estimate, dots: xsmle y x1, wmat(W) model(sdm) fe type(ind)

estimates a fixed-effects spatial Durbin model on the ten imputed versions of the x1 variable.
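Putting the steps together, the whole workflow can be sketched as a short do-file. This is only an illustration: it assumes that the panel (y, x1, cat) is in memory and xtset, that the spatial weights matrix W has already been loaded, and the seed value is arbitrary.

```stata
* Sketch of the multiple-imputation workflow described in this post.
* Assumptions: data in memory and xtset; W already loaded; seed is arbitrary.
mi set wide                             // declare the data as mi data (wide style)
mi register imputed x1                  // x1 is the variable to be imputed
set seed 12345                          // for reproducible imputations
mi impute regress x1 = i.cat, add(10)   // creates _1_x1, ..., _10_x1
mi describe                             // check the mi setup and number of imputations
mi estimate, dots: xsmle y x1, wmat(W) model(sdm) fe type(ind)
```

With $M = 10$ imputations, Rubin's rules pool the results as follows: the point estimate is the average of the $M$ completed-data estimates, and its total variance is $T = \bar{U} + (1 + 1/M)B$, where $\bar{U}$ is the average within-imputation variance and $B$ is the between-imputation variance of the estimates.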

References

Official Stata manuals and help files.
Rubin, D. B. 1987. Multiple Imputation for Nonresponse in Surveys. New York: Wiley.

Given the high number of requests, my colleagues and I have decided to provide a brief tutorial on how to get started with xsmle. First of all, you have to install the command by typing

net install xsmle, all from(http://www.econometrics.it/stata)

It is essential to use the all option: when you install a user-written package without it, the ancillary files (in this case "product.dta" and "usaww.spmat") are not downloaded. With the all option, the ancillary files are downloaded (provided an internet connection is available) either to your current working directory or to the directory you specified for ancillary files using the net set other command.
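A minimal sketch of how to control where the ancillary files end up is given below. The folder name is hypothetical and must already exist; the commands are standard net subcommands.

```stata
* Show the current net settings, including where ancillary files will go
net query

* Optional: send ancillary files to a specific folder (hypothetical path)
net set other "C:/data/xsmle"

* Install the package together with its ancillary files
net install xsmle, all from(http://www.econometrics.it/stata)

* If xsmle was already installed without -all-, fetch only the ancillary files
net get xsmle, from(http://www.econometrics.it/stata)
```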

Now, let's assume that you have not changed your "ancillary files" directory and that Stata's current working directory is the Desktop. You should then find two new files on your Desktop: "product.dta" and "usaww.spmat". These files contain information on public capital productivity in 48 US states observed over 17 years, as well as the spatial weights matrix for the US states (Munnell, 1990). The examples reported in the xsmle help file make use of these files; the data are also available in R through the Ecdat package.

The first step is to load "product.dta" into memory by typing

use product.dta, clear

Then, you have to do the same for the spatial weights matrix contained in the "usaww.spmat" file. Since this is an spmat file, it can be loaded by typing

spmat use W using "usaww.spmat"
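The estimation commands that follow use variables in logarithms. A minimal sketch of how these can be generated is shown here; the raw variable names (gsp, pcap, pc, emp) are assumed from the usual layout of this dataset, so check them with describe first.

```stata
* Assumed raw variable names: gsp, pcap, pc, emp (verify with -describe-)
describe
generate lngsp  = log(gsp)     // log of gross state product
generate lnpcap = log(pcap)    // log of public capital
generate lnpc   = log(pc)      // log of private capital
generate lnemp  = log(emp)     // log of employment
```

The unemployment rate (unemp) is a rate and enters the model in levels.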

At this stage, after computing the logarithm of the dependent and independent variables, we have all the ingredients to estimate a spatial panel data model. For instance, the syntax to estimate a Spatial AutoRegressive (SAR) model with random effects is

xsmle lngsp lnpcap lnpc lnemp unemp, wmat(W) model(sar)
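For comparison, a few variants of the same call are sketched below. The fe, type(ind) and model(sdm) options are the ones used in the multiple-imputation post on this page; the explicit re option is assumed here to be the counterpart of fe, so see help xsmle before relying on it.

```stata
* Random-effects SAR, stated explicitly (assumed equivalent to the default)
xsmle lngsp lnpcap lnpc lnemp unemp, wmat(W) model(sar) re

* SAR with individual fixed effects
xsmle lngsp lnpcap lnpc lnemp unemp, wmat(W) model(sar) fe type(ind)

* Spatial Durbin model with individual fixed effects
xsmle lngsp lnpcap lnpc lnemp unemp, wmat(W) model(sdm) fe type(ind)
```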

Notice that the dataset "product.dta" has already been declared to be a panel. In general, you need to declare your dataset to be a panel dataset by using the xtset command before running xsmle.
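For your own data, the declaration can be sketched as follows. The variable names id and time are hypothetical placeholders for your panel and time identifiers.

```stata
* id and time are hypothetical names for the unit and time identifiers
xtset id time
xtdescribe      // check the panel structure, e.g. whether it is balanced
```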