As the survey was conducted on a sample of households in Australia, it is important to take account of the method of sample selection when deriving estimates. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which they lived. Survey 'weights' are values which indicate how many population units are represented by the sample unit.

There is one weight provided: a person weight (FINPWT on the Basic and Expanded CURFs and SPFWT0 on the Basic CURF (International Comparison version)). This should be used when analysing the record level data.

Where estimates are derived, it is essential that they are calculated by adding the weights of persons in each category, and not just by counting the number of records falling into each category. If each person's 'weight' were to be ignored, then no account would be taken of a person's chance of selection in the survey or of different response rates across population groups, with the result that counts produced could be seriously biased. The application of weights ensures that the person estimates conform to an independently estimated distribution of the population by age, sex, state/territory, part of state, labour force status and highest educational attainment.
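The point above can be sketched in a few lines of code. This is an illustrative example only: the records and the "sex" category below are invented, and only the weight name FINPWT comes from the file.

```python
# Illustrative sketch: a weighted estimate is the sum of person weights
# (FINPWT), not a count of records. The records and the "sex" category
# here are invented for illustration.
records = [
    {"sex": "Male", "FINPWT": 1520.4},
    {"sex": "Female", "FINPWT": 1610.8},
    {"sex": "Male", "FINPWT": 1498.2},
]

def weighted_estimate(records, category, value):
    """Estimated number of persons in a category: sum of their weights."""
    return sum(r["FINPWT"] for r in records if r[category] == value)

males = weighted_estimate(records, "sex", "Male")           # weighted estimate
unweighted = sum(1 for r in records if r["sex"] == "Male")  # biased raw count
```

The weighted estimate (here about 3,019 persons) and the raw record count (2) answer different questions; only the former estimates the population total.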

STANDARD ERRORS FOR ESTIMATES WITHOUT PLAUSIBLE VALUES

Each record also contains 60 replicate weights and, by using these weights, it is possible to calculate standard errors for weighted estimates produced from the microdata. This method is known as the 60 group Jack-knife variance estimator.

Under the Jack-knife method of replicate weighting, weights were derived as follows:

60 replicate groups were formed with each group formed to mirror the overall sample (where units from a collection district all belong to the same replicate group and a unit can belong to only one replicate group)

one replicate group was dropped from the file and then the remaining records were weighted in the same manner as for the full sample

records in the group that were dropped received a weight of zero.

This process was repeated for each replicate group (i.e. a total of 60 times). Ultimately each record had 60 replicate weights attached to it with one of these being the zero weight (WRPWT01 - WRPWT60 on the Basic and Expanded CURFs and SPFWT1-60 on the Basic CURF (International Comparison version)).

Replicate weights enable variances of estimates to be calculated relatively simply. They also enable unit record analyses such as chi-square and logistic regression to be conducted which take into account the sample design. Replicate weights for any variable of interest can be calculated from the 60 replicate groups, giving 60 replicate estimates. The distribution of this set of replicate estimates, in conjunction with the full sample estimate (based on the general weight) is then used to approximate the variance of the full sample.

To obtain the standard error (SE) of a weighted estimate y, the same estimate is calculated using each of the 60 replicate weights. The variability between these replicate estimates (denoted y(g) for group number g) is used to measure the standard error of the original weighted estimate y using the formula:

\mathrm{SE}(y) = \sqrt{\frac{59}{60} \sum_{g=1}^{60} \left(y_{(g)} - y\right)^2}
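The calculation can be sketched as follows, assuming the full-sample estimate y and the 60 replicate estimates y(g) have already been computed as described above (the numbers in the usage example are invented):

```python
import math

# Sketch of the 60 group Jack-knife standard error. `full_estimate` is
# the estimate y based on the full-sample weight; `replicate_estimates`
# holds the 60 estimates y(g), one per replicate weight.
def jackknife_se(full_estimate, replicate_estimates, groups=60):
    # SE(y) = sqrt( (G - 1)/G * sum_g (y(g) - y)^2 ), with G = 60
    factor = (groups - 1) / groups
    return math.sqrt(factor * sum((yg - full_estimate) ** 2
                                  for yg in replicate_estimates))

# Invented illustration: replicate estimates scattered around 1000.
se = jackknife_se(1000.0, [990.0] * 30 + [1010.0] * 30)
```

If all 60 replicate estimates equal the full-sample estimate, the SE is zero; the more they scatter, the larger the SE.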

Use of the 60 group Jack-knife method for complex estimates, such as regression parameters from a statistical model, is not straightforward and may not be appropriate. The method as described does not apply to investigations where survey weights are not used, such as in unweighted statistical modelling.

STANDARD ERRORS FOR ESTIMATES WITH PLAUSIBLE VALUES

In order to minimise respondent burden, the three skill domains of literacy, numeracy and problem solving in technology rich environments were not directly assessed for each respondent. PIAAC used a matrix-sampling design to assign the assessment exercises to individuals so that a comprehensive picture of achievements in each skill domain across the country could be assembled from the components completed by each individual. PIAAC relied on Item Response Theory scaling to combine the individual responses to provide accurate estimates of achievement in the population. With this approach, however, aggregations of individuals' values can lead to biased estimates of population characteristics. To address this, the PIAAC scaling procedures also used a multiple imputation or "plausible values" methodology to obtain data for all individuals, even though each individual responded to only a part of the assessment item pool. By using all available data, ten "plausible values" were generated for each respondent for each of the three domains of literacy, numeracy and problem solving in technology rich environments.

For each domain, proficiency is measured on a scale ranging from 0 to 500 points. Each person's score denotes a point at which they have a 67 per cent chance of successfully completing tasks with a similar level of difficulty. To facilitate analysis, these continuous values have been grouped into 6 skill levels for Literacy and Numeracy with 'Below Level 1' being the lowest measured level. The levels indicate specific sets of abilities, and therefore, the thresholds for the levels are not equidistant. As a result, the ranges of values in each level are not identical. The relatively small proportions of respondents who actually reached Level 5 often resulted in unreliable estimates of the number of people at this level. For this reason, whenever results are presented in the main report by proficiency level, Levels 4 and 5 are combined. Further information about the Plausible Values and definitions of the three domains can be found in Scores and skill levels of the publication Programme for the International Assessment of Adult Competencies (PIAAC), 2011-12 (cat. no. 4228.0).

Each record contains the ten plausible values for each of the three domains.

For simple point estimates in any of the domains, it is sufficient to use one of the corresponding ten plausible values (e.g. PVLIT1 for the literacy domain), chosen at random to derive population estimates. If this method is chosen the standard error of the plausible score can be calculated using the formula for standard error for estimates without plausible values, as shown above.

However, a more robust estimate can be obtained by using all ten plausible values in combination. For example, in order to report an estimate of the total number of people at Level 1 for literacy, first calculate the weighted estimate of the number of respondents at Level 1 for each of the ten plausible values for literacy (PVLIT1-PVLIT10) individually. Next, sum the ten weighted estimates obtained. Then divide the result by ten to obtain the estimate of the total number of people at Level 1 for literacy. The process must then be repeated for each skill level.

Furthermore, when producing estimates by other variables available on the file, the process must be performed for each skill level by each category of the classifying variable(s). For example, in order to report an estimate of the total number of males at Level 1 for literacy, first calculate the weighted estimate of the number of males at Level 1 for each of the ten plausible values for literacy (PVLIT1-PVLIT10) individually. Next, sum the ten weighted estimates obtained. Then divide the result by ten to obtain the estimate of the total number of males at Level 1 for literacy. The process must then be repeated for each skill level and each category of interest.
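The steps above can be sketched in code. The records and their skill levels below are invented; each record's `pv_levels` list stands for the skill level implied by each of its ten plausible values (PVLIT1-PVLIT10), and FINPWT is the person weight:

```python
# Sketch of combining the ten plausible values into one estimate:
# compute the weighted estimate under each plausible value separately,
# then average the ten results. Records are invented for illustration.
records = [
    {"FINPWT": 1500.0, "pv_levels": ["Level 1"] * 10},
    {"FINPWT": 2000.0, "pv_levels": ["Level 1"] * 5 + ["Level 2"] * 5},
]

def pv_estimate(records, level):
    estimates = []
    for i in range(10):  # one weighted estimate per plausible value
        est = sum(r["FINPWT"] for r in records
                  if r["pv_levels"][i] == level)
        estimates.append(est)
    return sum(estimates) / 10  # average of the ten estimates

level1 = pv_estimate(records, "Level 1")
```

Adding a filter on another variable (e.g. sex) inside the inner sum gives the "by variable" estimates described above.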

Due to the use of multiple possible exercises and the application of plausible scoring methodology, the PIAAC plausible values also include significant imputation variability. The effect of the plausible values methodology on the estimation can be reliably estimated and is included in the calculated SEs. An accepted procedure for estimating the imputation variance using plausible values is to measure the variance of the plausible values (with an appropriate scaling factor) as follows:

V_{imp} = \left(1 + \frac{1}{10}\right) \frac{1}{9} \sum_{i=1}^{10} \left(y_i - \bar{y}\right)^2

where:

y_i is the weighted estimate calculated using the i-th plausible value, and \bar{y} is the mean of the ten estimates y_1, \ldots, y_{10}.

Together, the sampling variance and imputation variance can be added to provide a suitable measure of the total variance for the estimate as follows:

V_{total} = V_{samp} + V_{imp}

where:

V_{samp} is the sampling variance calculated using the replicate weights, and V_{imp} is the imputation variance calculated from the plausible values.

The total SE can be then obtained as the square root of the total variance. This SE indicates the extent to which the estimate might have varied by chance because only a sample of persons was included, and because of the significant imputation used in the literacy scaling procedures.

The total Relative Standard Error (RSE) can then be obtained by expressing the total SE as a percentage of the estimate to which it relates:

\mathrm{RSE}\%(y) = \frac{\mathrm{SE}(y)}{y} \times 100

NOT APPLICABLE CATEGORIES

Some data items included in the microdata include a 'Not applicable' category. The classification value of the 'Not applicable' category, where relevant, is shown in the data item lists in the Downloads tab. In order to comply with the scheme used by other countries participating in PIAAC, the following classification scheme was used to describe 'Not applicable' categories:

Valid skip - respondent was sequenced past the question as the question was not appropriate to them on the basis of information previously provided (note that this category was also assigned to missing values for part or full non-responding records)

Don't know - respondent didn't know the answer to the question

Refused - respondent refused to answer the question

Not stated or inferred - answer to the question could not be determined.

POPULATIONS

The population relevant to each data item is identified in the data item list and should be borne in mind when extracting and analysing data from the CURFs. The actual population count for each data item is equal to the total cumulative frequency minus the 'Valid skip' category.

Generally, all populations, including very specific populations, can be 'filtered' using other relevant data items. For example, if the population of interest is 'Employed persons', the Labour Force status data item CD05 on the Basic CURF can be used by applying the filter CD05 = 1. For the same population of interest on the Expanded CURF, the data item LFSAUS can be used by applying the filter LFSAUS = 1.
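Such a filter can be applied in code before any weighted estimation. This sketch assumes the CURF has been loaded as a list of dicts keyed by data item name; the records below are invented, and only CD05 and FINPWT come from the file:

```python
# Sketch of filtering to a population of interest before estimation.
# Employed persons on the Basic CURF: CD05 = 1. Records are invented.
records = [
    {"CD05": 1, "FINPWT": 1500.0},
    {"CD05": 2, "FINPWT": 1200.0},
]

employed = [r for r in records if r["CD05"] == 1]          # apply the filter
employed_estimate = sum(r["FINPWT"] for r in employed)     # weighted estimate
```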