The Household Disability Survey 2013 (HDS 2013) provides information on the prevalence of disability by age, sex, and ethnic group, down to region level. It also provides information on disability rates for specific impairment types, including sensory, physical, intellectual, and psychiatric/psychological impairment.

What small domain estimates are and why we produce them

While the HDS 2013 had a large sample size, with over 18,000 respondents, it is not large enough to produce reliable estimates for many territorial authorities (TAs) using our standard methods. We therefore use a technique called ‘small domain estimation’ (SDE) to produce these estimates.

SDE works by using a statistical model to estimate the relationship between information collected in the census about each TA (eg the TA’s demographics, and the proportion of people in the TA indicating they had a disability) and the measures we are estimating, which were collected in HDS 2013 (eg having a hearing impairment). We then estimate the prevalence of each measure within a TA by applying this model to the census attributes reported for that TA.

While we take a lot of care to produce the model and the resulting estimates, there is always a chance that a TA will have a different relationship between the census and HDS 2013 variables from the rest of the population, and that this difference will not be picked up because of the TA’s small sample size.

For example, a particular TA may have a relatively high proportion of people with a hearing impairment because that area has strong support networks. That TA may also have a small sample size, so the SDE relies mainly on the relationship estimated at the national level – between census information and the variable of interest (ie having a hearing impairment). In this situation our approach relies on census information capturing the increase in disability rate through the two census disability questions (ie that a relatively high proportion of people in that TA would tick the response for ‘I have a health problem or condition that causes me difficulty with hearing’). Although we would expect this to be the case, based on the strong relationship estimated by the model, we know this question isn’t answered perfectly – it is possible the estimates would not pick up the increase.

As with our standard survey estimates, we publish intervals alongside each estimate. Here these are Bayesian ‘credible intervals’: ranges within which we are 95% confident the true value lies. They assume the underlying model is correct, and take into account uncertainty due to the number of people in the sample and the variability in the data.
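As an illustration of how such an interval is read off a posterior distribution, the sketch below uses simulated draws in place of real MCMC output (all numbers are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-in for real MCMC output: 1,000 posterior draws of one TA's impairment rate
draws = rng.beta(9, 91, size=1000)

# Point estimate and 95% credible interval taken directly from the draws
point = float(np.median(draws))
lower, upper = np.percentile(draws, [2.5, 97.5])
```

The interval is simply the central 95 percent of the posterior draws, so it automatically reflects both sample size and data variability under the assumed model.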

The sections below give further detail on how we produce the SDEs.

Overview of the model we use

Small domain estimates were produced for the number of people with a:

disability (of any type)

hearing impairment

physical impairment

vision impairment

psychological/psychiatric impairment.

Each measure has a binary outcome (1 = Yes, 0 = No). A separate model is fitted for each measure, though each model uses similar explanatory variables and model structure.

The model was implemented within a hierarchical Bayesian framework using the statistical package ‘Stan’ (Gelman et al, 2014; http://mc-stan.org/). This modelling framework allows information to be pooled over TAs, so estimates for each TA are informed by the general pattern of variation in the outcome – over variables such as age and sex – as well as by the data from each TA. Inference was carried out using Markov chain Monte Carlo (MCMC) methods. The MCMC sampling comprised four chains. The number of simulations used in each chain depended on the measure, and ranged from 10,000 to 20,000. We used half the simulations in each chain for initialisation. The remaining simulations were available for inference and were thinned to 1,000.
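The simulation bookkeeping above works out as follows, taking the upper end of the range (a minimal arithmetic sketch, not Stan code):

```python
chains = 4
iterations_per_chain = 20_000               # ranged from 10,000 to 20,000 by measure
warmup_per_chain = iterations_per_chain // 2  # half of each chain used for initialisation

# Post-warmup draws available for inference, pooled across chains
post_warmup_total = chains * (iterations_per_chain - warmup_per_chain)  # 40,000

# Thinning down to 1,000 retained draws means keeping every 40th draw
kept_for_inference = 1_000
thin_interval = post_warmup_total // kept_for_inference  # 40
```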

We provide the estimates as both rates and counts. As for other estimates produced from HDS 2013, these estimates cover the usually resident population of New Zealand, living in occupied private dwellings or group homes on the night of the 2013 Census of Population and Dwellings. The estimates do not cover people living in non-private dwellings such as residential facilities, though we may expand the approach to cover these people if this would be useful. Information on the number of people in residential care facilities who have a disability is collected by the 2013 Disability Survey of Residential Facilities.

To give a sense of the impact of excluding people in non-private dwellings, table 1 shows the effect this exclusion has on national-level estimates of the disability measures contained in this tool.

Table 1
Effect of excluding people in residential facilities on national disability rate estimates, by age group and disability type (percent)

                        All          Vision       Hearing      Physical
Age group (years)    Incl  Excl   Incl  Excl   Incl  Excl   Incl  Excl
0-14                  11    11      1     1      1     1      1     1
15-44                 16    16      2     2      4     4      7     7
45-64                 28    28      5     5     11    11     17    17
65+                   56    59     10    11     27    28     46    49
Total                 23    24      4     4      8     9     14    14

Incl = including residential facilities; Excl = excluding residential facilities

Source: Stats NZ

Model specification

We produce the TA estimates in two steps.

1. A model is fitted that predicts the probability of each response type given a set of characteristics, such as a person’s age and the TA they associate with.

2. TA estimates are produced by weighting up the model’s predictions for each combination of the characteristics used in the model, by the proportion that the combination makes up in a particular TA. We calculate the proportions using counts from 2013 Census data (adjusted for census non-response). Because Māori and young adults tend to be under-represented in census counts, the counts are adjusted by 5-year age groups and Māori ethnicity. This brings the census proportions in line with the estimated resident population (Stats NZ 2014) and the target population used in HDS 2013.
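The weighting-up step amounts to a population-weighted average of the model’s cell-level predictions. The cells, rates, and counts below are illustrative, not real HDS 2013 values:

```python
import numpy as np

# Hypothetical prediction cells for one TA, eg (age group x census disability status)
# p_hat[c] = modelled Pr(outcome = 1) for cell c
p_hat = np.array([0.02, 0.30, 0.05, 0.55])

# counts[c] = adjusted 2013 Census count for cell c in this TA
counts = np.array([12_000, 800, 5_000, 1_200])

# Weight each cell's prediction by the proportion of the TA it makes up
proportions = counts / counts.sum()
ta_rate = (proportions * p_hat).sum()   # estimated prevalence in the TA
ta_count = counts.sum() * ta_rate       # estimated number of people
```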

The probability of having a disability measure is modelled as

Pr(Y = 1) = logit⁻¹(μ + Xβ + α_j + γ_s)

where

for a probability π, logit(π) = log(π / (1 − π))

α_j represents the random effect for TA j, and is assigned a prior distribution of α_j ∼ Normal(0, σ_ta)

γ_s represents the random effect for stratum s, and is assigned a prior distribution of γ_s ∼ Normal(0, σ_strata)

the parameters σ_strata and σ_ta are given prior distributions of Uniform(0, 2)

X contains person-level attributes from the census (sex, household size, whether the person indicated on their census form that they had a disability, and whether the person was of Māori ethnicity); the corresponding coefficients β are assigned independent prior distributions of Normal(0, 2) – no shrinkage occurs across these groups of parameters, so they can be viewed as ‘fixed effects’.
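In code, the probability for a single person or prediction cell is the inverse logit of the linear predictor. The sketch below uses hypothetical coefficient values; in practice these come from the fitted Stan model:

```python
import numpy as np

def inv_logit(eta):
    """logit^-1: map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-eta))

mu = -3.0                                # intercept (hypothetical)
x = np.array([1.0, 0.0, 1.0, 0.0])       # person-level census attributes (hypothetical coding)
beta = np.array([2.1, 0.1, -0.2, 0.3])   # 'fixed effect' coefficients, prior Normal(0, 2)
alpha_j = 0.15                           # TA random effect, prior Normal(0, sigma_ta)
gamma_s = -0.05                          # stratum random effect, prior Normal(0, sigma_strata)

# Pr(Y = 1) for this combination of characteristics
p = inv_logit(mu + x @ beta + alpha_j + gamma_s)
```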

We used 41 strata, which reflect the strata used in the sample design of HDS 2013. The strata are defined by regional council area, whether the PSU is urban or rural, and whether the PSU has a high or low proportion of Māori.

The number of people eligible for selection within a census night dwelling had an upper limit of eight, due to the small number of dwellings containing more than eight eligible people.

Disability questions included in the census

Two questions were included in the 2013 Census.

Q16: Does a health problem or a condition you have (lasting six months or more) cause you difficulty with, or stop you from:

seeing, even when wearing glasses or contact lenses

hearing, even when using a hearing aid

walking, lifting, or bending

using your hands to hold, grasp, or use objects

learning, concentrating, or remembering

communicating, mixing with others, or socialising, or

no difficulty with any of these.

Q17: Do you have a long-term disability (lasting six months or more) that stops you from doing everyday things other people can do?

Responses to these questions were used to define each individual’s ‘census disability status’, a key variable in the HDS 2013 sample selection and in the model used to produce these estimates.

Model selection

Variables included in the model were: census disability status, age, strata, Māori/non-Māori, and the number of eligible household members. All were included because they have a strong relationship to the chance a person was selected into the HDS 2013 sample and responded. Most of these variables were also significant factors in explaining each disability measure, though for household size and sex the effects were small.

The largest effects in the model, across all the disability measures, were for the census disability status and age. There was a strong interaction between these two effects. Apart from these two variables, including interaction terms between the main effects did not improve the model.

See below for more discussion on modelling the age and census disability status.

The census disability status captures whether someone has a disability (Yes/No), but does not capture details about what type of disability they are likely to have. We also looked at whether including more detailed information that was captured in the census questions would improve the model. For the hearing and vision impairment measures, the related census responses in question 16 (response options 2 and 1, respectively) had small but significant effects. We included these effects in the final model used for those measures. Adding more detailed census disability responses did not improve the model for the other measures.

Age effects

We looked at two broad options for age and its interaction with census disability status.

The first approach split the continuous age variable into discrete age groups. Indicator variables for each age group crossed with census disability status were then included in the model. We used two-year age groups up to age 10 years, then five-year groupings.

The second approach used a natural cubic spline, with separate splines used for people who indicated in the census that they had a disability, and those who didn’t.

Both approaches produced similar results. We decided to use the second approach as it supports producing estimates for any age breakdown (eg <18 years) rather than being restricted to the age groups included in the model. It also provides flexibility in how smoothly to model the age effects.

When fitting the spline we needed to choose the degrees of freedom to use, and where the resulting breakpoints should be placed. To make this decision we fitted a range of models with degrees of freedom ranging from 5 to 20, and breakpoints spread uniformly across the age distribution, and then compared the fit of these competing models. To assess model fit we used the 20-fold cross-validation error alongside the AIC. We did this initial model assessment in R, using the ‘splines’ and ‘gamm4’ packages. For simplicity, the models fitted at this point excluded the random effects for TA and strata.

Across the four measures, splines with degrees of freedom between 7 and 10 performed well. We used these models for model fitting in Stan and further analysis. The choice of spline gave small differences in the resulting estimates for some age groups. Because of this, we produced the final estimates and confidence intervals by averaging over the posterior distributions of the two models. Because the cross-validation error and AIC measures were very similar, we gave equal weight to each model. This technique is a simple approximation of Bayesian model averaging, and produces a wider confidence interval than selecting a single model because it captures some of the uncertainty due to model selection. Ideally we would have used this approach across a wider range of competing models; however, the differences were small enough that this did not seem worthwhile.
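Equal-weight averaging over two competing models can be done by pooling their posterior draws before summarising. A minimal sketch, with simulated draws standing in for real Stan output from the two spline specifications:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for posterior draws of one TA's rate under the two spline models
draws_model_a = rng.normal(0.080, 0.010, size=1000)
draws_model_b = rng.normal(0.084, 0.011, size=1000)

# Equal model weights: pool the draws, then summarise the mixture
pooled = np.concatenate([draws_model_a, draws_model_b])
estimate = pooled.mean()
lower, upper = np.percentile(pooled, [2.5, 97.5])
# The pooled interval is at least as wide as either single-model interval,
# because it also reflects disagreement between the two specifications
```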

Model checking

This section gives an overview of the model checking done when producing these estimates.

Comparing estimates produced by the SDE model with standard survey estimates for large domains

For large domains the estimates and the measure of uncertainty should be very similar between the two methods. We did these checks by five-year age groups, Māori, sex, strata, region, and census disability status. Checking against the standard survey estimates is useful to ensure we haven’t missed an important aspect of the weighting methodology, and to identify any parts of the population the model may not fit well.

Posterior predictive checks

We used repeated draws from the posterior distribution to simulate the observed sample. The distribution of simulated TA estimates was compared with the observed sample. No systematic issues, such as over-shrinkage (where the estimates look to be pulled too close to the overall average to support the variation observed in the sample) were identified in this checking.
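This style of check can be sketched as follows for a single TA (all numbers are hypothetical): each posterior draw of the TA’s rate generates a replicated sample, and the observed sample is compared against the distribution of replicates:

```python
import numpy as np

rng = np.random.default_rng(1)
n_respondents = 200         # HDS 2013 sample size in this TA (hypothetical)
observed_rate = 0.11        # rate observed in the sample (hypothetical)

# Stand-in posterior draws of the TA's true rate from the fitted model
posterior_rates = rng.beta(22, 178, size=1000)

# Simulate one replicated sample per posterior draw
simulated = rng.binomial(n_respondents, posterior_rates) / n_respondents

# An observed value deep in either tail of the replicates would flag a TA
# the model fits poorly (eg over-shrinkage towards the overall average)
p_value = (simulated >= observed_rate).mean()
```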

Sensitivity analysis

We investigated the sensitivity of estimates to aspects of model specification. Ideally the resulting estimates should be robust to reasonable changes in the model specification.

Sensitivity testing included:

different approaches to modelling age (as described above)

modelling the TA effects with a Student-t distribution with the degrees of freedom set to 3, rather than a Normal distribution. Using a Student-t distribution with low degrees of freedom gives a flatter distribution of TA effects, which would more easily accommodate a TA with an extreme effect. Using a Student-t instead of a Normal distribution produced little difference, so we kept the Normal distribution for the TA effects

fitting a wider prior distribution of Uniform(0,10) for σ_ta and σ_strata, rather than Uniform(0,2). This slowed convergence of the MCMC simulation but had little impact on the resulting estimates.

Apart from the different approaches to modelling age, changes to the modelling assumptions did not result in substantive differences in the resulting model and estimates.

Issues we identified during the model checking

We identified two issues during the model checking.

The first was that while the SDE model matched unweighted survey estimates for Māori and non-Māori well, it did not match the weighted estimates produced using our standard weighting methodology as well as we would have liked. However, there was substantial overlap in the confidence intervals between the two estimates. There are a number of reasons this can occur, including that the SDEs use much more detailed census information than is used to form the standard weights, and also that the model does not mirror the process for producing the initial selection weights perfectly.

To ensure consistency with survey estimates and minimise any chance of bias in the SDE estimates, we benchmarked those for Māori/non-Māori to match the survey estimates.

The approach we used to do this is outlined in appendix C of Bryant et al (2016). This approach benchmarks the posterior distribution used to produce SDEs to random draws from a normal distribution centred on the standard survey point estimates, with standard deviation equal to the survey standard errors. The margins we used in this situation were Māori and non-Māori, though the approach could be used more widely. The benefit of this approach is that the uncertainty associated with the survey estimates is considered.
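A simplified sketch of this style of benchmarking is below. It applies a draw-by-draw ratio adjustment, which is a cruder device than the method in Bryant et al (2016), but shows how uncertainty in the survey benchmark is carried through (all numbers are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n_draws = 1000

# Stand-in posterior draws of the model's national Maori margin
model_margin = rng.normal(0.24, 0.004, size=n_draws)

# Random draws centred on the standard survey point estimate,
# with standard deviation equal to the survey standard error
survey_estimate, survey_se = 0.26, 0.006
benchmark_draws = rng.normal(survey_estimate, survey_se, size=n_draws)

# Draw-by-draw ratio adjustment, applied to every TA-level draw for this margin
factors = benchmark_draws / model_margin

# eg applied to posterior draws of one TA's Maori rate
ta_draws = rng.normal(0.22, 0.01, size=n_draws)
ta_draws_benchmarked = ta_draws * factors
```

Because the benchmark is itself a set of random draws rather than a fixed number, the uncertainty in the survey estimates flows through to the benchmarked posterior.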

The benchmarking had the largest effect on the overall disability measure – estimates for Māori were increased by a factor of 1.09, and the overall estimates by a factor of 1.02.

The second issue we identified was that for the vision impairment and the hearing impairment measures the SDE model overstated the national estimates for the 0-4 age group compared to the estimates we produced using our standard survey weighting. The disability rates for these measures are small, typically varying between 1 percent and 2 percent. While the difference between the two approaches was small, we decided to suppress these estimates as we were not happy with how the model captured this age group.

How we take the sample design into account

Two important aspects need to be taken into account when forming SDEs:

some population groups are over- or under-represented among the respondents in the sample

the different stages of sample selection.

Some groups are over/under-represented in the sample’s respondents

Over- or under-representation arises because the sample design targets some parts of the population more heavily (eg people from smaller regions), and because some parts of the selected sample are harder to contact or less likely to respond to the survey.

To ensure the SDEs represent the population, the variables used to target the sample design are included in the model, along with key variables associated with how likely a person is to respond to the survey (eg age, sex, strata). As described above, the model predictions are then weighted up, using data from the 2013 Census on each TA’s population characteristics.

Different stages of sample selection

Another aspect of the sample design is that people are not sampled directly, but through different stages of selection.

The first selection stage consists of selecting areas that contain around 60 dwellings. These are often called primary sampling units (PSUs). The second stage consists of selecting people from within the selected PSUs. The third stage of selection consists of randomly removing people from the sample to ensure a maximum of one person is selected from each household.

Because people who live within the same PSU tend to have similar characteristics, this type of design usually leads to larger sampling errors than if people were selected directly. In our initial model fitting, we accounted for this part of the design by (i) including a random effect for PSUs in the model, and (ii) integrating this effect out of the posterior distribution to generate the marginal effects of the model’s other characteristics.

In the initial models that we fit, the effect of PSU clustering on the estimates of uncertainty was very small – increasing the width of confidence intervals by between 1 and 5 percent. Including, and then integrating out, the PSU effects substantially increases the time required to fit each model; we therefore decided to exclude PSU effects from the model fitting so we could fit and compare a wider range of models.

Once we selected the final model, we made a design-based adjustment to the resulting prediction intervals to account for the effect of PSU clustering on the precision.
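A minimal sketch of such a design-based adjustment widens a model-based interval about its point estimate by a clustering factor. The interval and factor below are hypothetical, with the factor inside the 1–5 percent range quoted above:

```python
# Model-based 95% interval for one TA before adjustment (hypothetical values)
point, lower, upper = 0.085, 0.070, 0.101

# Widening factor for PSU clustering (a 3 percent increase in interval width)
widening = 1.03

# Stretch each arm of the interval away from the point estimate
adj_lower = point - widening * (point - lower)
adj_upper = point + widening * (upper - point)
```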

A third selection stage consists of sub-sampling people selected within the same household. Because of this sub-sampling, people in larger households have a smaller chance of being selected. The number of people eligible for the sample is therefore included in the model. We made no attempt to adjust for any sampling efficiency gained due to household members being similar to each other. This may cause the resulting confidence intervals to be wider than necessary, but the effect is expected to be small.