Converting data to the Australian Statistical Geography Standard Seminar

This seminar covers the various methods by which data can be converted from the Australian Standard Geographical Classification (ASGC) to the new Australian Statistical Geography Standard (ASGS), their conceptual basis and their advantages and disadvantages. This seminar is of interest to anyone who has to deal with the transition to the ASGS, needs to analyse data across the ASGC and ASGS, or deals with clients or stakeholders who use statistics on sub-state regions. It assumes a basic awareness of both the ASGC and ASGS.

Converting data to the ASGS can have different levels of impact. The extent to which this can be managed depends on the nature of the data and how it was collected and assigned. The region the data is currently assigned to is one factor that impacts on the ability to convert data to the ASGS.

There can be major impact on data assigned to:

Census Collection Districts (CCDs)

Statistical Local Areas

Statistical Subdivisions

Statistical Divisions

Labour Force Regions.

As the ASGC is phased out and the ASGS takes over completely, these regions will no longer be maintained or supported with data by the ABS. Data held at these levels will therefore be substantially affected.

There is varied impact on data for the following capital cities, as these have changed in the GCCSA structure:

Melbourne

Adelaide

Perth

Brisbane.

Statistically the changes are unlikely to be significant for the following capital cities:

Canberra

Sydney

Hobart

Darwin.

The change to the ASGS has some impact on data assigned to:

Remoteness Areas

Urban Centres and Localities

Indigenous Structures.

The impact on data for these regions should not seriously affect comparability of data over time, but users analysing data for these regions need to be aware of the change as a possible source of anomalous results.

There is little impact on data for Local Government Areas (LGAs). The change, however, has some impact on data for other non-ABS structures. In general, these regions are more accurately represented in the ASGS because they are approximated by SA1s (which give a much better geographical resolution than CCDs). However, users analysing data for these regions need to be aware of the change as a possible source of anomalous results.

Converting data between the ASGC and the ASGS is complex and depends on the nature of the data and the nature of the geographic areas on which the data was compiled.

Time series data in particular is affected by the move to the ASGS. However, it is difficult to generalise about where recreating a time series on ASGS regions is effective. This depends on the nature of the data and the purpose for which it is intended.

Generally there are two broad options for converting data from the ASGC to the ASGS: address coding and correspondences.

Address coding involves coding the original collection units to the relevant areas of the ASGS and then re-aggregating their associated data. This is only possible where the address or latitude and longitude of the original unit record data is available.

If latitudes and longitudes are available, it is possible to code the units to any level of the ASGS using the digital boundaries and a Geographical Information System (GIS) such as MapInfo or ArcInfo.
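The coding step that a GIS performs here is essentially a point-in-polygon test followed by re-aggregation. The following is a minimal sketch in plain Python, not the workflow of any particular GIS package. The region codes, rectangular boundaries and unit records are all hypothetical; real work would use the published ASGS digital boundaries.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count the polygon edges crossed by a ray
    running east from the point; an odd count means 'inside'."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # Longitude at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def code_to_region(point: Point, boundaries: Dict[str, List[Point]]) -> Optional[str]:
    """Return the code of the first region containing the point, or None."""
    for code, polygon in boundaries.items():
        if point_in_polygon(point, polygon):
            return code
    return None


# Hypothetical rectangular regions (NOT real ASGS boundaries)
boundaries = {
    "REGION_A": [(150.0, -34.0), (151.0, -34.0), (151.0, -33.0), (150.0, -33.0)],
    "REGION_B": [(151.0, -34.0), (152.0, -34.0), (152.0, -33.0), (151.0, -33.0)],
}

# Hypothetical unit records with coordinates and a value to re-aggregate
records = [
    {"id": 1, "lon": 150.5, "lat": -33.5, "value": 10},
    {"id": 2, "lon": 151.5, "lat": -33.2, "value": 7},
    {"id": 3, "lon": 150.2, "lat": -33.9, "value": 5},
    {"id": 4, "lon": 153.0, "lat": -33.5, "value": 2},  # outside both regions
]

totals: Dict[str, int] = {}
uncoded: List[int] = []
for rec in records:
    code = code_to_region((rec["lon"], rec["lat"]), boundaries)
    if code is None:
        uncoded.append(rec["id"])  # keep track of records that could not be coded
    else:
        totals[code] = totals.get(code, 0) + rec["value"]
```

Records that fall outside every boundary are collected separately rather than dropped silently, so the uncoded remainder can be reviewed.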

If the complete street address is available, it is possible to code the units to any level of the ASGS using address coding software. Such software is available commercially, and many state governments have this capability, as does the ABS.

Locality, Postcode and State are all part of an address and used in conjunction can effectively code data to the SA2 level and above in the ASGS. This can be done using a suburb/locality to SA2 index which is available free of charge from the ABS.
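As an illustration of why the three fields are used in conjunction, the sketch below looks up an SA2 code from a small index keyed on locality, Postcode and State together. The index rows, locality names and SA2 codes are invented for the example and do not reflect the layout of the ABS index.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str, str]  # (locality, postcode, state)


def normalise(locality: str, postcode: str, state: str) -> Key:
    """Upper-case and collapse whitespace so trivial differences still match."""
    return (" ".join(locality.upper().split()), postcode.strip(), state.strip().upper())


# Hypothetical index rows; the same locality name appears in two states
index: Dict[Key, str] = {
    ("EXAMPLEVILLE", "9001", "NSW"): "SA2_001",
    ("SAMPLE BAY", "9002", "NSW"): "SA2_002",
    ("EXAMPLEVILLE", "9101", "VIC"): "SA2_101",
}


def code_address(locality: str, postcode: str, state: str) -> Optional[str]:
    """Return the SA2 code for the address parts, or None if there is no match."""
    return index.get(normalise(locality, postcode, state))


nsw_code = code_address(" exampleville", "9001", "nsw")
vic_code = code_address("Exampleville", "9101", "Vic")
missing = code_address("Nowhere", "0000", "QLD")
```

Keying on locality alone would confuse the two places named Exampleville in this made-up index; adding Postcode and State resolves them to different SA2s.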

There are several issues with using Postcode alone to code data to the ASGS. There is no official geographic definition of Postcodes and they do not cover the whole of Australia. In general, Postcodes are larger than suburbs and consequently cannot effectively code data to the more detailed structures of the ASGS. However, it is possible to code data with reasonable accuracy to the SA4 and GCCSA levels using a Postcode to SA4 index. This is also available free of charge from the ABS.

At present the ABS does not provide an address geocoding service, as we do not own the intellectual property of the G-NAF file, which is owned by PSMA Australia. The ABS does, however, have an agreement with PSMA Australia to include our Mesh Block codes in the G-NAF file. This allows G-NAF users to relate addresses to the ABS geography.

There are several commercial organisations that offer geocoding services based on G-NAF. More details on these companies can be found on the PSMA website: http://www.psma.com.au/data-access/

To view and obtain available coding indexes please refer to the following link: Correspondences

When collecting address information for statistical data it is important to recognise the current addressing standards to maximise the extent to which data can be coded using the G-NAF.

The current addressing standard for Australia and New Zealand is Geographic Information - Rural and Urban Addressing (AS/NZS 4819:2011). Copies are available for purchase from Standards Australia. A direct link to the purchase page is added here:

The addressing standard provides requirements and guidelines that authorities use in assigning addresses, naming roads and localities, and recording and mapping the related information. It outlines the various elements of the system and provides guidelines for the application of those elements to a range of address site types in both urban and rural areas.

Correspondences can be used where the location information of original collection units is not available. They are a mathematical method of reassigning data from one geographic region to a new geographic region.

Correspondences are less reliable than address coding and the results can be misleading in some circumstances. This is because they are based on the assumption that the data to be converted is distributed across the original regions in the same way as the correspondence's weighting. This assumption may or may not be reasonable depending on the circumstance. Correspondences, therefore, need to be used with a great deal of care.
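A small worked example may help show how this assumption can fail. The sketch below uses entirely hypothetical numbers: one old region is split between two new regions by a population-weighted correspondence, but the variable being converted (a count of farms) is distributed very differently from the population.

```python
# Hypothetical population-weighted split of one old region into two new regions
population_weights = {"NEW_A": 0.6, "NEW_B": 0.4}

# Hypothetical true farm counts, which would only be known from address-coded data
actual_farms = {"NEW_A": 50, "NEW_B": 450}

# Total reported for the old region (all that is known without location information)
old_region_total = sum(actual_farms.values())

# The correspondence assumes farms are spread like the population
estimated = {region: old_region_total * w for region, w in population_weights.items()}

# Difference between the correspondence estimate and the true count
errors = {region: estimated[region] - actual_farms[region] for region in actual_farms}
```

With these made-up figures the correspondence puts about 300 farms in NEW_A where only 50 actually exist. The old-region total is preserved, but its split between the new regions is badly wrong because the weighting does not suit the data.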

Most correspondences are weighted either by population or by area. Population weighted correspondences are usually more effective for social and demographic data, area weighted correspondences for agricultural and environmental data.
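Mechanically, applying a weighted correspondence means multiplying each old region's value by the ratio for each new region it overlaps and summing the results. The following is a minimal sketch under assumed conventions: each row is (from region, to region, ratio), and the ratios for each from region sum to 1. The region names, ratios and data values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, str, float]  # (from_region, to_region, ratio)


def apply_correspondence(data: Dict[str, float], rows: List[Row]) -> Dict[str, float]:
    """Redistribute values from old regions to new regions using weighted ratios."""
    # Check that each old region's ratios sum to 1, so totals are preserved
    ratio_sums: Dict[str, float] = defaultdict(float)
    for from_region, _, ratio in rows:
        ratio_sums[from_region] += ratio
    for from_region, total in ratio_sums.items():
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"Ratios for {from_region} sum to {total}, not 1")

    # Every old region with data must appear in the correspondence
    unmatched = set(data) - set(ratio_sums)
    if unmatched:
        raise ValueError(f"No correspondence rows for: {sorted(unmatched)}")

    converted: Dict[str, float] = defaultdict(float)
    for from_region, to_region, ratio in rows:
        converted[to_region] += data.get(from_region, 0.0) * ratio
    return dict(converted)


# Hypothetical correspondence: OLD_1 falls wholly in NEW_A, OLD_2 is split
rows = [
    ("OLD_1", "NEW_A", 1.0),
    ("OLD_2", "NEW_A", 0.25),
    ("OLD_2", "NEW_B", 0.75),
]
data = {"OLD_1": 200.0, "OLD_2": 400.0}

converted = apply_correspondence(data, rows)
```

The same function serves for population-weighted and area-weighted correspondences; only the source of the ratios differs.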

Correspondences are more accurate where the regions on which the data was originally aggregated are smaller than the new regions. For example, corresponding data from Census Collection Districts (CCDs) to SA2s is generally accurate because most CCDs are smaller than SA2s and most CCDs are completely or largely contained within a single SA2.

Allocation tables, or hierarchy tables, are often included in the term correspondence. These reflect the situation where one set of regions is precisely nested within a second. For example, by definition, one or more whole SA1s make up an SA2. Allocations only contain one-to-one or many-to-one relationships and are therefore straightforward to use, as they are simple aggregations.
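Because each smaller region maps to exactly one larger region, using an allocation table reduces to a lookup and a sum, with no ratios involved. The sketch below assumes the table has been loaded as a simple mapping; the SA1 and SA2 codes and the counts are hypothetical.

```python
from typing import Dict

# Hypothetical allocation table: each SA1 belongs to exactly one SA2
allocation = {
    "SA1_0001": "SA2_A",
    "SA1_0002": "SA2_A",
    "SA1_0003": "SA2_B",
}

# Hypothetical counts held at the SA1 level
sa1_counts = {"SA1_0001": 120, "SA1_0002": 80, "SA1_0003": 50}


def aggregate(values: Dict[str, int], allocation: Dict[str, str]) -> Dict[str, int]:
    """Sum child-region values into their parent regions."""
    totals: Dict[str, int] = {}
    for child, value in values.items():
        parent = allocation[child]  # raises KeyError if a child is not in the table
        totals[parent] = totals.get(parent, 0) + value
    return totals


sa2_counts = aggregate(sa1_counts, allocation)
```

Letting a missing code raise an error, rather than skipping it, keeps the aggregated total equal to the input total.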

Allocation tables are available under 'Downloads' for all published ASGS volumes. To view and obtain available allocation tables please follow the links below:
