Typically, we model count data, or integer-valued data, with the
gamma-Poisson distribution.

Recall that the Poisson distribution is a distribution over integer
values parameterized by \(\lambda\). One interpretation of
\(\lambda\) is that it is the rate at which events occur
within a fixed interval, assuming these events occur independently. The
gamma distribution is conjugate to the Poisson distribution, so the
gamma-Poisson distribution allows us to learn both the distribution over
counts and the rate parameter \(\lambda\).
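
To make the conjugacy concrete, here is a minimal sketch of the gamma-Poisson update, assuming a Gamma(alpha, beta) prior on \(\lambda\) in the shape/rate parameterization. The function names, the prior values, and the example counts are illustrative assumptions, not part of the analysis in this notebook.

```python
import math

def gamma_poisson_update(alpha, beta, counts):
    """Posterior (shape, rate) for lambda after observing Poisson counts.

    Assumes a Gamma(alpha, beta) prior with beta as a rate parameter.
    """
    return alpha + sum(counts), beta + len(counts)

def predictive_pmf(k, alpha, beta):
    """Probability that the next count equals k, with lambda integrated out.

    This is a negative binomial pmf, evaluated in log space for stability.
    """
    p = beta / (beta + 1.0)
    log_pmf = (math.lgamma(k + alpha) - math.lgamma(alpha) - math.lgamma(k + 1)
               + alpha * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

# Made-up counts, used only to illustrate the update.
counts = [2, 0, 3, 1, 4]
alpha_post, beta_post = gamma_poisson_update(1.0, 1.0, counts)

# Posterior mean of lambda is shape / rate.
posterior_mean = alpha_post / beta_post
print(alpha_post, beta_post, posterior_mean)
```

The update only needs the sum of the counts and the number of observations, which is what makes the gamma prior convenient here: the posterior over \(\lambda\) stays a gamma distribution, and the predictive distribution over the next count is available in closed form.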

We can map these orderings of dur and educ to produce a crosstab
heatmap of n, the number of women:

plt.figure(figsize=(9, 6))
ct = pd.crosstab(ceb_int['dur'], ceb_int['educ'], values=ceb_int['n'], aggfunc=np.sum).sort_index(ascending=False)
sns.heatmap(ct, annot=True)
plt.yticks(ceb_int['dur'].drop_duplicates().values - .5, ceb['dur'].drop_duplicates().values)
plt.xticks(ceb_int['educ'].drop_duplicates().values - .5, ceb['educ'].drop_duplicates().values)
plt.ylabel('duration of marriage (years)')
plt.xlabel('level of education')
plt.title('heatmap of marriage duration by level of education')

Since dur and educ are ordinal-valued, each column takes a
small number of integer values.
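
As a small illustration of what an ordinal integer encoding looks like, the sketch below maps ordered labels to integer codes. The label lists and the `encode_ordinal` helper are hypothetical stand-ins; the actual category labels and the mapping used to build ceb_int are not shown in this section.

```python
def encode_ordinal(values, order):
    """Map each label to its position in the given ordering."""
    code = {label: i for i, label in enumerate(order)}
    return [code[v] for v in values]

# Hypothetical orderings, assumed only for this example.
dur_order = ['0-4', '5-9', '10-14', '15-19']
educ_order = ['none', 'lower primary', 'upper primary']

dur_codes = encode_ordinal(['5-9', '0-4', '15-19', '5-9'], dur_order)
educ_codes = encode_ordinal(['none', 'upper primary', 'none'], educ_order)
print(dur_codes, educ_codes)
```

Because the codes follow the ordering of the labels, comparisons between codes preserve the ordering of the underlying categories, and each column can only take as many distinct integer values as it has categories.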