Sunday, August 10, 2014

Guns are cool - Regions

This was supposed to be a post in which General Social Survey (GSS) data were used to understand a bit more about what causes the differences between states. It was thus meant to give additional insight beyond my previous post, Guns are Cool - Differences between states. Unfortunately, that did not work so well, and it ended up as a kind of investigation of regions.

Data

Analysis

The GSS has data by region rather than by state. Hence incidences by region were calculated. After aggregation the following results were obtained. From an analysis point of view, New England is something of an outlier: its incidence is so much lower that any model should cater for this feature. Since there are only nine regions, which is very few for an analysis, and potentially tens of independent variables, I gave up on linking this incidence to the GSS.

         StateRegion shootings Population  Incidence
5        New England        12   14618806 0.05148008
4           Mountain        30   22881245 0.08222644
8 West North Central        30   20885710 0.09008280
9 West South Central        58   37883604 0.09601666
3    Middle Atlantic        68   41970716 0.10160906
6            Pacific        91   51373178 0.11108997
7     South Atlantic       110   61137198 0.11283843
2 East South Central        36   18716202 0.12062981
1 East North Central       101   46662180 0.13574575
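As a rough check on the ranking, the aggregation can be sketched in a few lines of Python. The counts and populations are copied from the table above; the choice of "shootings per 100,000 inhabitants" is my assumption for illustration. The Incidence column in the post is on a different scale (presumably it also accounts for the length of the observation period, which is not reproduced here), but a constant scaling does not change the order of the regions.

```python
# Shootings and population per Census Bureau region, copied from the table
# in the post, in the same (ascending incidence) order.
regions = {
    "New England": (12, 14618806),
    "Mountain": (30, 22881245),
    "West North Central": (30, 20885710),
    "West South Central": (58, 37883604),
    "Middle Atlantic": (68, 41970716),
    "Pacific": (91, 51373178),
    "South Atlantic": (110, 61137198),
    "East South Central": (36, 18716202),
    "East North Central": (101, 46662180),
}


def rate_per_100k(shootings, population):
    """Raw rate per 100,000 inhabitants (ASSUMED unit, not the post's scale)."""
    return shootings / population * 100_000


rates = {name: rate_per_100k(s, p) for name, (s, p) in regions.items()}

# Regions sorted from lowest to highest raw rate.
ranked = sorted(rates, key=rates.get)

for name in ranked:
    print(f"{name:<20} {rates[name]:.4f}")

# How far New England sits below the next lowest region.
gap_ratio = rates[ranked[0]] / rates[ranked[1]]
print(f"lowest / second lowest: {gap_ratio:.2f}")
```

The printed ratio of the lowest to the second lowest rate gives a quick feel for how isolated New England is from the other eight regions.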

Regions and days

I had been wondering whether the weekday effect observed before would vary between states. But spreading the data over all states would thin the data too much and make the plots unappealing. The nine regions work much better. While the Sunday/weekend effect is present in all regions, its size differs markedly. Sundays seem particularly bad in West North Central. Three regions (New England, West South Central and Mountain) each have a weekday without any shootings.
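The tabulation behind such a region-by-weekday plot can be sketched as follows. This is only an illustration: the `events` list is made up and hypothetical, not the shootings data used in the post, and the helper name `weekdays_without_events` is mine.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# HYPOTHETICAL (region, date) records, invented purely to show the mechanics.
events = [
    ("New England", date(2014, 1, 5)),    # a Sunday
    ("New England", date(2014, 1, 12)),   # a Sunday
    ("New England", date(2014, 1, 6)),    # a Monday
    ("Mountain", date(2014, 1, 11)),      # a Saturday
]

# Count events per (region, weekday) cell.
counts = Counter((region, WEEKDAYS[day.weekday()]) for region, day in events)


def weekdays_without_events(region):
    """Weekdays on which the given region has no recorded event."""
    return [w for w in WEEKDAYS if counts[(region, w)] == 0]


print(counts[("New England", "Sunday")])
print(weekdays_without_events("New England"))
```

With only nine regions each cell holds enough events to be readable; with fifty states most cells would be empty, which is the thinning problem mentioned above.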

Hierarchical model for estimates of incidence

In Guns are Cool - States I lamented that some states may have looked worse because the model pulled them too strongly. The regions may provide a solution for this: like goes with like. After some twiddling of the model and quite some tweaking of hyperpriors, a new plot of the states' incidence was made. It should be noted that I am not happy about parameters a[] and b[]: their Rhat is 1.15 and their effective sample size is 20. Partly this is because they covary. The beta parameter calculated from a[] and b[] looks much better. It should also be noted that during development an older version of the 2014 data was used, which made New England look better.
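To illustrate what "pulling" means, here is a minimal sketch using a conjugate gamma-Poisson model. This is a swapped-in textbook model, not the JAGS model used for the plot: I am assuming here that a and b act as the shape and rate of a gamma prior on a region's incidence, and all numbers are hypothetical.

```python
def prior_mean(a, b):
    """Mean of a Gamma prior with shape a and rate b."""
    return a / b


def posterior_mean(count, exposure, a, b):
    """Posterior mean of a Poisson rate under a Gamma(a, b) prior.

    `exposure` is population times observation time, in arbitrary units.
    """
    return (a + count) / (b + exposure)


# HYPOTHETICAL regional prior with mean 0.1.
A, B = 2.0, 20.0

# Two hypothetical states with the same raw rate (0.2) but different size.
small_state = posterior_mean(count=1, exposure=5.0, a=A, b=B)
large_state = posterior_mean(count=20, exposure=100.0, a=A, b=B)

print(f"prior mean  {prior_mean(A, B):.3f}")
print(f"small state {small_state:.3f}")
print(f"large state {large_state:.3f}")

# Doubling both a and b leaves the prior mean unchanged; only the strength
# of the pull changes.
print(prior_mean(2 * A, 2 * B) == prior_mean(A, B))
```

The small state is pulled much closer to the regional mean than the large one, even though both have the same raw rate. The last line also hints at why a and b can covary in the posterior: many (a, b) pairs share the same ratio, so a quantity derived from both may be much better determined than either parameter alone.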

In the plot the states are a bit closer to each other than before, which is not what I expected. New England is near the bottom, as is to be expected. New Hampshire and Vermont hence look better. Illinois is now worst; it has fewer states to pull its score downwards.
For completeness' sake it should be noted that this calculation was done on the Census Bureau regions. A look at Wikipedia shows more ways to divide the US into regions; the Standard Federal Regions may provide a better split for these data. States in Region VIII (Colorado, Montana, North Dakota, South Dakota, Utah, Wyoming) and X (Alaska, Idaho, Oregon, Washington) then have lower posterior incidences. But since this division was tested because it looked more suitable, especially for states in VIII and X, it has an unknown selection bias and hence its results are not presented.

Wiekvoet

Wiekvoet is about R, JAGS, STAN, and any data I have interest in. Topics range from sensometrics, statistics, chemometrics and biostatistics. For comments or suggestions please email me at wiekvoet at xs4all dot nl.