BIG WORLD, BIG DATA

June 28, 2018

The number of potential applications of big data is immense. Initially intended as a private sector tool, big data is now finding its place within the realm of politics. Cambridge Analytica’s involvement in the Trump and Brexit campaigns has marked the onset of a new era in which big data may be used not only to analyse a population, but also to influence its political views and preferences.

The evolution of technology and the use of big data have forcefully shifted the balance of power within society. It is no longer the person who watches the algorithm, but rather the algorithm that watches the person [2]. The main features of big data – volume, velocity, and variety – make it a very appealing tool, as they allow for the discernment of patterns and relationships that are not readily evident from the input data itself.

Big data increases “situational awareness” by recording trends as they take place. This is often used by major supermarket chains such as Wal-Mart, which handles more than a million customer transactions every hour [4]. For example, records of buying behaviour can indicate whether a customer is conservative, or whether they are prone to shifting preferences based on prices, branding, and other factors. Nevertheless, one must be aware that big data can only show correlation between events and cannot, on its own, establish causation.

Because big data collection is largely corporate-centric, the corporate sector is where it is most readily deployed. Big data is an essential tool for detecting bank fraud; should a transaction deviate from the customer’s normal buying patterns, the bank is able to block the activity immediately [5]. In contrast to commercial applications, however, deployment of big data analysis “for the public good” has not been widespread. One place where big data could have been useful was the 2007 mortgage crisis in the United States, which triggered the global financial crisis of 2008. Had big data analysis been performed on debt securities, the bubble might have been halted at its inception.
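The fraud check mentioned above, comparing a transaction against a customer’s normal pattern, can be sketched in a few lines. This is only an illustration under simple assumptions, not how any particular bank does it: the function name `is_suspicious`, the sample amounts, and the threshold of three standard deviations are all hypothetical, and real systems consider far more than the amount spent.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Flag an amount that deviates strongly from the customer's past amounts.

    Hypothetical rule: more than `threshold` standard deviations from the mean.
    """
    if len(history) < 2:
        return False  # too little history to define a "normal" pattern
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # every past amount was identical
    return abs(amount - mu) / sigma > threshold

# Made-up purchase history for one customer
history = [42.0, 38.5, 51.0, 45.25, 40.0, 47.5]

print(is_suspicious(history, 44.0))   # False: in line with past purchases
print(is_suspicious(history, 900.0))  # True: far outside the usual range
```

The check only works because the customer’s history supplies a baseline to compare against, which is also why it cannot say much about a customer or an event with no comparable history.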

This, however, is where the limitations of big data analysis become obvious. The first issue is the amount of data available for algorithmic consumption. The predictive power of big data can only be strengthened by a “significant number of known instances of a particular behaviour” [6]. This means that while bank fraud is a common and well-researched problem with a distinct pattern, unprecedented crises like the mortgage bubble are not easily predictable.

Another limitation comes from the creation of the algorithm itself. The algorithm is built by feeding it an “example data” set and tasking it with finding correlations in that data [7]. Data that is separate from the example set is then used to test the effectiveness of the resulting algorithm. This can sometimes produce an algorithm that forecasts well on the sample used to create it, but is still inadequate for classifying new test data.
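The gap between performance on the example set and on separate test data can be shown with a deliberately extreme toy case. In this sketch the “algorithm” simply memorises the examples it was given; the labelling rule, the split into even and odd values, and every name in the code are assumptions made purely for illustration.

```python
def make_data(values):
    # Hypothetical ground truth: the label is True when the value is 50 or more
    return [(x, x >= 50) for x in values]

example_set = make_data(range(0, 100, 2))  # even values, used to build the model
test_set = make_data(range(1, 100, 2))     # odd values, held back for testing

# "Training" is pure memorisation of the example set
memory = dict(example_set)

def predict(x):
    # Return the memorised label, or guess False for anything never seen
    return memory.get(x, False)

def accuracy(data):
    correct = sum(1 for x, label in data if predict(x) == label)
    return correct / len(data)

print(accuracy(example_set))  # 1.0: perfect on the data it was built from
print(accuracy(test_set))     # 0.5: no better than a fixed guess on new data
```

A perfect score on the example set says nothing here about how the model handles unseen records, which is why the separate test set is needed to judge it.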

While there is a significant risk of result politicization – where the data expert finds the scenarios they were initially hoping to find – the fast expansion of available data sets and their dynamic nature make big data analysis a very powerful tool for business and research.