I'm working with a large dataset in which every sample has a lat/long pair. Most samples were geocoded to the block face (front of the building) level, but some were geocoded only to the ZIP code level, which is less accurate. We tested the accuracy of the ZIP-level geocoding against address-level geocoding (this was possible because street addresses were available for a limited number of the ZIP-coded samples) and found a fairly large error: a median difference of 0.63 miles. However, when I plot a ZIP-level lat/long pair on Google Maps, it shows up as a very specific location.
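For concreteness, a comparison of this kind can be sketched roughly as follows. This is only a minimal sketch, not the exact test we ran: the function names `haversine_miles` and `median_error_miles` and the sample coordinates are made up for illustration, and it assumes great-circle (haversine) distance on a spherical Earth.

```python
import math
from statistics import median

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def median_error_miles(pairs):
    """Median distance over pairs of (zip_level_point, address_level_point).

    Each point is a (lat, lon) tuple in degrees.
    """
    return median(
        haversine_miles(z[0], z[1], a[0], a[1]) for z, a in pairs
    )


# Made-up coordinates, for illustration only (not real samples).
sample = [
    ((40.7500, -73.9900), (40.7550, -73.9850)),
    ((40.7000, -74.0100), (40.7080, -74.0000)),
    ((40.8000, -73.9500), (40.8010, -73.9490)),
]
print(round(median_error_miles(sample), 2))
```

The median is used rather than the mean so that a handful of badly misplaced points does not dominate the summary.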

What then does the error tell us?

Is the lat/long point plot (from zip geocoding) an erroneous one?

Could anyone explain the difference between zip and block face geocoding, in the sense of loss of accuracy for the former?

Finally, if I am to convert lat/long and zip code information to census tract, does the fact that the sample has been coded to zip level make a precision difference in this exercise?

1 Answer

Don't trust ZIP code geocoding; it is accurate to within a mile at best. See this post for more information:

One of the biggest misconceptions GIS users have about ZIP codes is that they are a set of polygons that cover the United States--they are not. ZIP codes are a system used by the Postal Service for sorting mail before delivery, and nothing more. If an address receives enough mail, the USPS will just assign it its own ZIP code to improve sorting efficiency. Many post offices also have a separate ZIP code just for their P.O. Boxes.

The ZIP Code Tabulation Areas (ZCTAs) are the Census Bureau's best estimate of the spatial delineation of ZIP codes, reverse-engineered from responses to the Bureau's surveys, but they are not exact. I've found significant differences between ZIP codes (from geocoded addresses) and ZCTAs, especially in areas with an uneven distribution of postal mail, such as dense urban centers, commercial and industrial parks, and very rural areas.
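On the census tract part of the question: a ZIP-level point that sits well away from the true address can land in a different tract than the address itself. Here is a minimal sketch of that effect, assuming a plain ray-casting point-in-polygon test. The tract IDs, the square "tracts", and the two points are all hypothetical; real work would use actual tract boundaries (for example the Census Bureau's TIGER/Line files) and a proper GIS library.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def tract_for_point(lon, lat, tracts):
    """Return the ID of the first tract containing the point, else None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(lon, lat, polygon):
            return tract_id
    return None


# Hypothetical tracts: two adjacent unit squares (not real geography).
tracts = {
    "tract_A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "tract_B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}

address_point = (0.9, 0.5)  # hypothetical address-level geocode (lon, lat)
zip_point = (1.2, 0.5)      # hypothetical ZIP-level geocode for the same sample

print(tract_for_point(*address_point, tracts))
print(tract_for_point(*zip_point, tracts))
```

Running the same lookup on both the address-level and ZIP-level points for the samples where you have both would tell you how often the tract assignment actually changes in your data.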