Machine-Generated data

Machines and sensors are routinely used to capture events, measure consumption or record the physical world. The Internet, and technological innovation more generally, has enabled sensors to capture a wide variety of data in a non-disruptive way. Examples include fixed sensors that are in place for security or monitoring purposes, mobile sensors that use GPS technology to capture spatial and temporal trends, and computer systems. One of the reasons that we are able to gather such vast quantities of data at a high velocity is that we have decreased the role of humans in data generation. It is now easy to generate accurate and precise spatial information from mobile devices; previously, georeferencing was expensive and far more complicated to implement. Together, machines generate several new forms of data that can be used to assist decision making in real time. For instance, “Smart Cities” amalgamate real-time data from a wide range of sensors, including CCTV, weather monitors and traffic sensors, in addition to other types of Big Data (such as social network data).

One example of a fixed sensor is a machine used to monitor energy usage. Recently, UK energy companies have begun installing smart meters in every property. These devices record energy consumption in real time, enabling the collection of energy trends for individual addresses at a very high temporal frequency. Most commonly, data are stored at 30-minute intervals. While the primary objective of the data is to improve the efficiency and transparency of billing, they can also reveal interesting geographic trends. For instance, Dugmore (2010) suggests that, as occupied households use utilities daily, unoccupied dwellings could be identified by pooling utility companies' data. The high temporal resolution of the data enables us to segment households based on energy usage throughout the day. Research has found smart meter data to be an effective predictor of household characteristics and a viable alternative to the Census for some measures. For instance, Samson (2014) clustered the standardized average daily energy profiles of smart meters from deprived postcodes. The research identified that most energy profiles fit one of four trends (Figure 9-5). Each of the trends was found to be associated with households at a different life stage.
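To make the idea of clustering standardized daily profiles concrete, the following is a minimal sketch in Python. It is not the procedure used by Samson (2014), whose exact method is not described here. It assumes each household is represented by 48 half-hourly average readings, standardizes each profile with a z-score so that clusters reflect the shape of the day rather than total consumption, and groups the profiles with a plain k-means routine. The function names, the choice of k-means, the farthest-first initialization and the synthetic readings are all illustrative assumptions.

```python
import numpy as np


def standardize_profiles(profiles):
    """Z-score each row (one household's average day) independently."""
    profiles = np.asarray(profiles, dtype=float)
    mean = profiles.mean(axis=1, keepdims=True)
    std = profiles.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # flat profiles: avoid division by zero
    return (profiles - mean) / std


def farthest_first_init(X, k):
    """Deterministic start: take row 0, then repeatedly the row farthest
    from all rows chosen so far (an assumption made for reproducibility)."""
    chosen = [0]
    while len(chosen) < k:
        dists = np.linalg.norm(X[:, None, :] - X[chosen][None, :, :], axis=2)
        chosen.append(int(dists.min(axis=1).argmax()))
    return X[chosen].copy()


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_first_init(X, k)
    for _ in range(n_iter):
        # Distance from every profile to every centroid, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic (made-up) profiles: 48 half-hourly slots per household.
base = np.full(48, 0.2)
morning_peak = base.copy()
morning_peak[14:18] = 1.0   # 07:00 to 09:00
evening_peak = base.copy()
evening_peak[36:42] = 1.0   # 18:00 to 21:00
households = np.array([morning_peak, 3 * morning_peak,
                       evening_peak, 2 * evening_peak])

labels, centroids = kmeans(standardize_profiles(households), k=2)
print(labels)
```

Because each row is standardized separately, a household that uses three times as much energy but follows the same daily rhythm lands in the same cluster as its lower-consumption counterpart. With real smart meter data, k would be set by inspection or a validity measure rather than fixed in advance.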

There are also sensors that are not tied to fixed physical locations. Satellite imagery has historically been a common source of geographic data, and there has often been more data than researchers can handle. However, there are now new forms of data that have benefited from GPS technologies. One such example is fitness apps that use mobile devices to record routes taken during exercise (Figure 9-6). These types of apps (and indeed many others) record detailed tracks of sensor movements and can be informative about real-world travel activities. Typically, the apps record locations at very regular intervals so that journeys can be mapped and attributes such as elevation and speed can be appended. The proliferation of handheld devices has vastly increased the supply of geospatial data on the population. It is not uncommon for such data to be re-purposed to estimate traffic congestion and the popularity of restaurants and stores.
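As an illustration of how an attribute such as speed can be appended to a recorded track, the sketch below derives the speed of each segment from consecutive GPS fixes. It assumes a track is a time-ordered list of (timestamp in seconds, latitude, longitude) tuples and uses the haversine great-circle distance on a spherical Earth. The function names and the sample coordinates are hypothetical, and real apps may well use different formulas and smoothing.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def segment_speeds(track):
    """Speed in metres per second for each pair of consecutive fixes.

    `track` is a list of (timestamp_s, lat, lon) tuples sorted by time.
    Pairs with a non-positive time gap are skipped.
    """
    speeds = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # duplicate or out-of-order fix
        speeds.append(haversine_m(lat0, lon0, lat1, lon1) / dt)
    return speeds


# Made-up fixes recorded every 10 seconds in central London.
track = [
    (0, 51.5000, -0.1000),
    (10, 51.5003, -0.1000),
    (20, 51.5006, -0.1002),
]
print(segment_speeds(track))
```

Raw GPS fixes are noisy, so in practice speeds computed this way are usually smoothed or filtered before they are mapped or aggregated.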

Figure 9-5 Temporal heat maps of energy usage across a typical day by four clustered energy profile types. Each row is an anonymized smart meter for a unique household. Source: Samson (2014)

Figure 9-6 Running routes recorded by a fitness app in London (30 km east to west). Source: Lansley and Cheshire (2018)