Using a Double-Constrained Gravity Model to Derive Regional Purchase Coefficients

This paper describes a gravity model for estimating gross trade flows between states. The model is doubly constrained so that domestic imports and exports “cancel out”: when domestic imports and exports are summed over all states, they are equal for each commodity. In other words, all sources of supply and demand are accounted for. The results of this model will be incorporated into the IMPLAN software. It updates the “current” RPC methodology, an econometric formulation using MRIO data that were ultimately based on the 1977 Commodity Flow Survey. An econometric method using the recently released 1993 Commodity Flow Survey was rejected because those data fail to distinguish producers from distributors.

Introduction

Determining gross commodity import and export flows is of fundamental importance to deriving regional inter-industry accounts. Estimates of regional purchase coefficients (RPCs), the share of a commodity user’s purchases that is met from within the region rather than by gross imports, have long been the primary focus of research in this area. However, estimating RPCs from single-region characteristics does not take into account spatial variables such as a region’s place in a trade hierarchy or the proximity and size of alternative markets. These phenomena are best modeled with spatial interaction, or gravity, models.

A gravity model is similar to, and named for, Newton’s Law of Gravity, in which the attraction between two masses depends on the size of the masses and the distance between them. It is usually shown as:

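Equation 1: $$F = G\,\frac{m_1 m_2}{d^{2}}$$

where F is the attraction between masses m1 and m2, d is the distance between them, and G is a constant representing the force of gravity.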

Spatial interaction systems model the gross flows between nodes, such as the import and export flows between regional economies. In general terms, the import and export flows between regions are thought to be proportional to the “mass”, “attractiveness” or “size” of an economy and inversely proportional to the “distance” or cost of moving goods and services between them. Mass variables often are interpreted as gross supply and demand while distance is frequently equated with the cost of moving goods and services from one location to another. A simple gravity model for a given commodity would take the form:

Equation 2: $$T_{ij} = G\,\frac{O_i D_j}{d_{ij}^{b}}$$

where: Tij = trade flows between regions i and j;

Oi = total commodity supply originating in region i;

Dj = total commodity demand used in region j;

$d_{ij}^{b}$ = the distance function, with dij the distance between regions i and j;

G = constant of trade (gravity);

b = exponent: the larger the exponent, the greater the friction to trade.

Trade between two regions varies directly with their size (supply and demand) and inversely with the distance between them.

If we ignore distance (i.e., b = 0, so the denominator of equation 2 becomes 1), trade between regions i and j depends only on supply meeting a suitable demand. Supply from region i goes to meet demand in region j according to j’s share of all regions’ demand:

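Equation 3: $$P_{ij} = \frac{D_j}{D}$$

where Pij is the probability of supply from i going to demand in j, and D is the sum of all regions’ demands, $D = \sum_j D_j$. The trade flow is this probability applied to region i’s supply:

Equation 4: $$T_{ij} = O_i P_{ij} = O_i\,\frac{D_j}{D}$$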

Table 1. An example of supply and demand for a 3-region model

| Region | Supply of Commodity (Oi) | Demand for Commodity (Dj) | Share of Demand (Dj/D) |
|---|---|---|---|
| 1 | 100 | 200 | 0.2 |
| 2 | 600 | 300 | 0.3 |
| 3 | 300 | 500 | 0.5 |
| Total | 1000 | 1000 | 1.0 |

Based on the simple model and the demand shares in table 1, 20% of region 1’s supply will go to region 1 itself. Since region 1’s supply is 100 units, T11 = 20.

We can create a matrix of trade flows based on the information provided in table 1.

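Table 2. Matrix of trade flows based on the assumption of no trade friction (rows are supplying regions i, columns are demanding regions j)

| Region | 1 | 2 | 3 | Total |
|---|---|---|---|---|
| 1 | 20 | 30 | 50 | 100 |
| 2 | 120 | 180 | 300 | 600 |
| 3 | 60 | 90 | 150 | 300 |
| Total | 200 | 300 | 500 | 1000 |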
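The frictionless allocation behind table 2 can be reproduced in a few lines. The following Python sketch is illustrative only and is not part of the IMPLAN software; the function name `frictionless_flows` is hypothetical. It assumes each flow is region i’s supply multiplied by region j’s share of total demand, as in the T11 example above.

```python
# Supply (O_i) and demand (D_j) by region, taken from table 1.
supply = [100, 600, 300]
demand = [200, 300, 500]


def frictionless_flows(supply, demand):
    """Split each region's supply across destinations by demand share (b = 0)."""
    total_demand = sum(demand)
    return [[o_i * d_j / total_demand for d_j in demand] for o_i in supply]


flows = frictionless_flows(supply, demand)
for region, row in enumerate(flows, start=1):
    # Each printed row is one supplying region followed by its row total.
    print(region, row, sum(row))
```

Each row of the result sums to the supplying region’s supply and, because distance is ignored, each column also sums to the destination’s demand.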

However, the attractiveness of a region decreases with distance as a result of the time and cost of delivering goods. Experience has shown that this decrease is not linear; hence the exponent “b”.

Equation 5: $$P_{ij} = \frac{D_j / D}{d_{ij}^{b}}$$

Equation 6: $$T_{ij} = O_i\,\frac{D_j / D}{d_{ij}^{b}}$$

Experience has also shown that equations 5 and 6 overestimate the volume of shorter hauls (Isard, 1960; Carroll and Bevis, 1957), so we modify the denominator to account for all competing sources of demand.

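Equation 7: $$P_{ij} = \frac{(D_j / D)\,/\,d_{ij}^{b}}{\sum_{k} (D_k / D)\,/\,d_{ik}^{b}}$$

Equation 8: $$T_{ij} = O_i\,\frac{(D_j / D)\,/\,d_{ij}^{b}}{\sum_{k} (D_k / D)\,/\,d_{ik}^{b}}$$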

Note from equation 7 that the probabilities Pij sum to 1.0 over all destinations j. Therefore, the sum of all trade from region i to all regions j is equal to the total supply from region i:

$$\sum_{j} T_{ij} = O_i$$

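We can simplify equation 8 by recognizing that D cancels out and by setting:

Equation 9: $$A_i = \left[\sum_{k} D_k\, d_{ik}^{-b}\right]^{-1}$$

We can in this way derive equation 10:

Equation 10: $$T_{ij} = A_i\, O_i\, D_j\, d_{ij}^{-b}$$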
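To illustrate how distance reshapes the flows, the sketch below applies a single-constrained model to the table 1 supplies and demands. It is a minimal Python illustration, not the authors’ implementation. It assumes the distance function is dij raised to the power -b, with Ai chosen so that each region’s outflows sum to its supply. The distance matrix (including an intraregional distance of 1), the exponent b = 2, and the function name are hypothetical values chosen for the example.

```python
supply = [100.0, 600.0, 300.0]   # O_i from table 1
demand = [200.0, 300.0, 500.0]   # D_j from table 1
# Hypothetical distances d_ij (assumption); intraregional distance set to 1.
distance = [[1.0, 4.0, 6.0],
            [4.0, 1.0, 3.0],
            [6.0, 3.0, 1.0]]
b = 2.0                          # assumed distance exponent


def singly_constrained_flows(supply, demand, distance, b):
    """T_ij = A_i * O_i * D_j * d_ij**-b, with A_i scaling row i to O_i."""
    flows = []
    for o_i, dist_row in zip(supply, distance):
        # Demand at each destination, discounted by distance.
        weights = [d_j * d_ij ** -b for d_j, d_ij in zip(demand, dist_row)]
        a_i = 1.0 / sum(weights)
        flows.append([a_i * o_i * w for w in weights])
    return flows


flows = singly_constrained_flows(supply, demand, distance, b)
row_sums = [sum(row) for row in flows]
col_sums = [sum(row[j] for row in flows) for j in range(len(demand))]
print([round(v, 2) for v in row_sums])   # outflows by supplying region
print([round(v, 2) for v in col_sums])   # inflows by demanding region
```

With these assumed distances each row sums to the region’s supply, but the column sums no longer match the demands in table 1. Only the supply side is constrained.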

IMPLAN Accounts for Gross Commodity Supply and Demand

The IMPLAN social accounting system offers unique opportunities for modeling domestic region-to-region trade flows with spatial interaction models. Primary among these is accounting consistency: the accounting system is complete, and gross supply and demand for all regions add up to supply and demand on a (domestic) national basis.

This advantage can be exploited as follows. For gross domestic commodity supply,

Equation 11

Where: M is the byproducts matrix for region r;

x is the industry output for region r;

z is non-industry (institutional sales) output for region r;

f is foreign export from region r; and

s is the domestic commodity supply for region r.

Similarly, for gross domestic commodity demand,

Equation 12

Where: A is the gross absorption matrix for region r;

x is the industry output for region r;

y is gross final commodity demand for region r;

f is foreign export from region r; and

d is gross commodity demand for region r.

When summed over all regions (e.g., states or counties),

Equation 13

That is, all sources of supply and destinations of demand for commodities are accounted for by the system. Domestic trade flows between regions, that is, domestic imports and exports, are similarly closed.

A Double-Constrained Gravity Model for Trade Flows

Equations 9 and 10 describe a singly constrained model: the trade flows Tij out of each region i sum to that region’s supply, but nothing forces the flows into region j to sum to its demand. Because both supplies and demands for all commodities are known for the system, an interaction model that honors both is required to estimate the trade flows between regions of supply and demand. In general, the form of this model for a particular commodity is as follows.

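Equation 14: $$T_{ij} = A_i\, O_i\, B_j\, D_j\, d_{ij}^{-b}$$

Where: Tij = trade flows between regions i and j;

Oi = total commodity supply originating in region i;

Dj = total commodity demand used in region j;

Ai = $\left[\sum_{j} B_j D_j\, d_{ij}^{-b}\right]^{-1}$ (Equation 15);

Bj = $\left[\sum_{i} A_i O_i\, d_{ij}^{-b}\right]^{-1}$ (Equation 16);

and

$d_{ij}^{-b}$ = the distance function.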

We have introduced a new term, Bj. This formulation assures that the two constraints

Equation 17: $$\sum_{j} T_{ij} = O_i$$

and

Equation 18: $$\sum_{i} T_{ij} = D_j$$

are satisfied; in other words, both the known supplies (Oi) and demands (Dj) can be correctly obtained from the interaction data.

Notice that the expression for Ai includes the term Bj, while that for Bj includes Ai. This means that they must be computed by iteration. The iteration starts by setting Bj = 1 and solving for Ai (equation 15). The resulting Ai is then substituted into equation 16 to obtain a new Bj. The process is repeated until Ai and Bj no longer change.

Ai and Bj are constants derived through the iteration process. They incorporate the inverse distance relationship between regions i and j and force the constraints set by equations 17 and 18 to hold.
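The iteration described above can be sketched as follows. This Python example is a minimal illustration under stated assumptions, not IMPLAN’s code: it assumes the distance function is dij raised to the power -b, reuses the table 1 supplies and demands, and uses a hypothetical distance matrix, exponent, tolerance, and function name.

```python
supply = [100.0, 600.0, 300.0]   # O_i from table 1
demand = [200.0, 300.0, 500.0]   # D_j from table 1
# Hypothetical distances d_ij (assumption); intraregional distance set to 1.
distance = [[1.0, 4.0, 6.0],
            [4.0, 1.0, 3.0],
            [6.0, 3.0, 1.0]]


def doubly_constrained_flows(supply, demand, distance, b=2.0,
                             tol=1e-12, max_iter=1000):
    """Balance A_i and B_j by iteration and return the flow matrix T_ij."""
    n, m = len(supply), len(demand)
    # Distance function f_ij = d_ij ** -b (assumed functional form).
    f = [[distance[i][j] ** -b for j in range(m)] for i in range(n)]
    a = [1.0] * n
    bj = [1.0] * m                       # start with B_j = 1
    for _ in range(max_iter):
        # Equation 15: A_i from the current B_j.
        new_a = [1.0 / sum(bj[j] * demand[j] * f[i][j] for j in range(m))
                 for i in range(n)]
        # Equation 16: B_j from the updated A_i.
        new_bj = [1.0 / sum(new_a[i] * supply[i] * f[i][j] for i in range(n))
                  for j in range(m)]
        # Largest relative change in any balancing factor.
        change = max(abs(new - old) / new
                     for new, old in zip(new_a + new_bj, a + bj))
        a, bj = new_a, new_bj
        if change < tol:
            break
    return [[a[i] * supply[i] * bj[j] * demand[j] * f[i][j]
             for j in range(m)] for i in range(n)]


flows = doubly_constrained_flows(supply, demand, distance)
row_sums = [sum(row) for row in flows]
col_sums = [sum(flows[i][j] for i in range(len(supply)))
            for j in range(len(demand))]
print([round(v, 3) for v in row_sums])   # should reproduce the supplies
print([round(v, 3) for v in col_sums])   # should reproduce the demands
```

The sketch relies on total supply equaling total demand, as in table 1; otherwise both sets of constraints cannot hold at once.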

Implementation in IMPLAN

Double-constrained trade flows can be estimated at the state level and then at the county level within each state, or they can be calculated for all ~3200 counties simultaneously. The advantage of calculating all counties simultaneously lies in the consistent treatment of regions that cross state boundaries. The disadvantage is the billions of bits of data that have to be handled. The data exist to mechanize the creation of multi-region models. The probability values can be stored within the IMPLAN data itself, which allows a user to modify region data without disturbing the trade flow estimates.

Model Validation

We can represent the distance variable (d) as simple centroid distances or, preferably, as a cost-per-mile function mapped over transportation networks. The distance exponent (b) can most simply be set to 2, or it can be adjusted to calibrate the model to existing trade flow data.

The most recent and perhaps the most complete source of trade flow data is the 1993 Commodity Flow Survey (Bureau of the Census 1996). How can we use the 1993 Commodity Flow Survey to estimate trade flow equations? What are the drawbacks? The CFS is a survey of domestic firms that estimates commodity origin and destination data at the 2-digit commodity level for states and at the 3-digit commodity level for CTARs (transportation regions). The Bureau of the Census collected data from commodity producing and shipping industries as shown in table 3, below.

Given perfect information we could use the data from the CFS to determine how much of local supply was used to meet local demand for each state. The ratio of local demand met by local supply is called the Regional Purchase Coefficient (RPC) (Stevens and Trainor, 1980).
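As an illustration of this ratio, the sketch below computes RPCs from a trade flow matrix, here the frictionless flows of table 2. It assumes rows are supplying regions and columns are demanding regions, so the RPC of region j is the intraregional flow Tjj divided by the column total. The function name is hypothetical.

```python
# Trade flows T_ij from table 2 (rows: supplying region, columns: demanding region).
flows = [[20, 30, 50],
         [120, 180, 300],
         [60, 90, 150]]


def regional_purchase_coefficients(flows):
    """Return, for each region j, the share of its demand met by its own supply."""
    n = len(flows)
    rpcs = []
    for j in range(n):
        demand_j = sum(flows[i][j] for i in range(n))   # column total D_j
        rpcs.append(flows[j][j] / demand_j)             # T_jj / D_j
    return rpcs


rpcs = regional_purchase_coefficients(flows)
print(rpcs)
```

For the frictionless flows, each region’s RPC equals its share of total supply (0.1, 0.6 and 0.3), since distance plays no role in the allocation.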

Unfortunately, the CFS data (available on CD) are not perfect.

The first problem is that import data must be derived on a state-to-state basis, which forces over half of the observed values to be non-disclosed. However, we have been told by the Bureau of Transportation Statistics (co-sponsor of the CFS) that import data by state by 2-digit commodity from all domestic sources will be included in the 1997 CFS, which is scheduled to be released about mid-1999.

Second, the CFS does not differentiate shippers who are manufacturers from shippers who are resellers, i.e., the wholesale sectors and the mail order houses. This has two effects:

manufactured goods are shipped and reshipped, i.e., there will be double counting;

wholesale and retail purchaser prices are combined with manufacturing producer prices, which may alter the relationship we might be able to obtain on net exports (assuming we could overcome the import non-disclosure problems).

How important are resellers in state-to-state commodity flows?

The imbalance between exports and value of shipments by commodity is tremendous. For example, for the 663 sampling points we have available (i.e., disclosed and non-zero data available for both value of shipments from the 1993 Annual Survey of Manufactures and outflows from the 1993 CFS for 2-digit SIC by state), there are only 52 cases where value of shipments exceeds exports. In five percent of the data points, outflow exceeds value of shipments by a factor of five.

Figure 1

Figure 1 shows the ratio (vertical axis) of exports to value of shipments (production) for food processing for all states (horizontal axis, by FIPS code). A value greater than one means that exports exceed production. In all cases, the value is greater than one. All 2-digit SIC codes reflected this relationship.

Surely, wholesalers and mail order houses cause a tremendous amount of import to re-export activity. This is exacerbated by the “invisible production” represented by foreign imports. The upshot is that validation, in the near term, will have to be on a subjective basis, comparing the gravity model results to the data that are available in the CFS.

The CFS has a wealth of other data by commodity, including tons, ton-miles, average miles and value of shipments, by commodity and transportation mode which will greatly improve the gravity model.

1/ This model is an adaptation of a double-constrained gravity model for trip distribution described by Lee (1973).