Careful with the Data, Cost of Living Edition

The two most commonly used cost of living measures these days are the BEA’s regional price indexes and the C2ER’s cost of living index. The BEA figures are relatively new, having been around for only a couple of years at this point. They provide data at the metro and state level for just a few broad categories: all items, goods, rents, and other services.

The C2ER data (formerly ACCRA) has been around a long time and provides much greater detail on the cost of individual items, all the way down to the price of a men’s dress shirt or a T-bone steak. I know because I used to do this pricing survey at my previous job. I drove to five different stores in the area and priced dozens of products to provide a local estimate of cost. This makes the C2ER data the best choice at the local level, no question. The problem is that the C2ER data only covers the areas that are actually priced, so it is not representative of other, or larger, areas. Unfortunately, the local prices still get rolled up into statewide measures. Even though C2ER uses a population-weighted average, the result is still not representative if your state does not have good coverage in terms of the underlying cities that are, you know, actually used in the population-weighted average calculation.
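To make the coverage problem concrete, here is a minimal sketch in Python. Every area name, population, and index value in it is invented for illustration and is not C2ER or BEA data. It only assumes that a statewide figure is a population-weighted average over whichever areas happen to be priced.

```python
def weighted_index(areas):
    """Population-weighted average over (population, index) pairs."""
    total_pop = sum(pop for pop, _ in areas)
    return sum(pop * idx for pop, idx in areas) / total_pop

# Hypothetical state. All numbers are made up for illustration only.
all_areas = {
    "big_metro": (2_000_000, 110.0),
    "mid_city": (500_000, 98.0),
    "small_towns": (300_000, 92.0),
    "resort_town": (100_000, 135.0),
}

# Assume only these two areas took part in the pricing survey.
priced = ["big_metro", "resort_town"]

full_index = weighted_index(all_areas.values())
covered_index = weighted_index([all_areas[name] for name in priced])

print(f"All areas:         {full_index:.1f}")
print(f"Priced areas only: {covered_index:.1f}")
```

In this made-up case the "statewide" number built from the priced areas comes out several points higher than the average over every area, even though the weighting itself is done correctly. The weights are fine; the coverage is the problem.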

All of which brings me to the point and the title of the post. Be careful with your use of data! I have seen this specific example a few times over the years in various reports (both government and consulting) that use the state level C2ER data. It reared its head again the other day with the money-rates.com ranking of “best states to make a living in,” which generated lots of public comments and questions across the web.

For comparison purposes, below I show both the BEA and C2ER measures for Oregon and Washington.

The BEA measure shows Oregon just below the U.S. average and Washington just above. However, the C2ER data shows Oregon significantly above the U.S. average and Washington just above. What’s going on here? My suspicion is that it comes down to the underlying cities used to generate the statewide average. If you look at the C2ER data for just Portland vs Seattle, it shows Portland is certainly above the U.S. average but 1-2% lower than Seattle. So if the two largest cities are so close, how can the state differences be so large? In reality, they can’t. It’s not realistic to assume that living in Pendleton is more expensive than Tri-Cities to such a degree that it overwhelms the similarities in the states’ largest cities to make the statewide average nearly 25% different. C2ER data is not useful at the state level; however, it is the best, most detailed data available at the city level.
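The same sanity check can be run as back-of-the-envelope arithmetic. The sketch below uses invented numbers throughout (two hypothetical states, a hypothetical 60% metro population share, and made-up index values) and solves the weighted-average identity, state = share × metro + (1 − share) × rest, for the index the rest of the state would need in order to produce a given statewide figure.

```python
def implied_rest_index(state_index, metro_index, metro_share):
    """Index the non-metro part of a state must have so that
    state_index = metro_share * metro_index + (1 - metro_share) * rest."""
    if not 0 <= metro_share < 1:
        raise ValueError("metro_share must be in [0, 1)")
    return (state_index - metro_share * metro_index) / (1 - metro_share)

# Hypothetical inputs, chosen only to mimic the pattern described above:
# metros within about 2% of each other, statewide figures roughly 25% apart.
rest_a = implied_rest_index(state_index=128.0, metro_index=105.0, metro_share=0.6)
rest_b = implied_rest_index(state_index=103.0, metro_index=107.0, metro_share=0.6)

print(f"State A, implied rest-of-state index: {rest_a:.1f}")
print(f"State B, implied rest-of-state index: {rest_b:.1f}")
```

Under these assumptions, the smaller communities in state A would have to be far more expensive than its own largest metro for the statewide number to hold up, which is the kind of result that should send you back to check how the data was put together.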

To summarize, be careful with your data. It’s always helpful to know how your data is constructed and how best to use it. Of course, there are times you must use the data you have, not the data you wish you had. However, in this case there is a clear alternative, the BEA data, that is better suited to state level use.