The 21st Century: The Age of Connectivity

We are constantly creating, sending, and receiving data: data about people, how they interact, how they are related, and so on. What not everyone realizes is that math is more than a theoretical curiosity; it can help you better understand entities like a social network or a set of scientific collaborations. In fact, there is a whole branch of mathematics devoted to studying these systems, known as graph theory or network science. It is a classical field that dates back to the 18th century and grows more important every day. A data scientist can combine this knowledge with technologies that give us access to network data sets and start to look at how we are all connected.

Network science provides a number of algorithms that can be used to characterize a network and extract significant information from it. For example, in social networks we can identify central users, influencers, and communities. In a network of transactions, pattern recognition can be used to flag suspected money laundering, and cycle detection might indicate tax evasion. In an items-for-sale network we can automatically find clusters of items for intelligent advertising and marketing. The combination of mathematics and machine learning is key to unraveling complex networks.
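To make the cycle detection idea concrete, here is a minimal sketch of how one might look for a loop in a directed network of transactions. It uses a plain depth-first search, which is one common way to do it and not necessarily what any particular tool uses. The function name `find_cycle` and the account labels are made up for illustration.

```python
def find_cycle(edges):
    """Return one directed cycle as a list of nodes, or None if there is none."""
    # Build an adjacency list so every node has an entry, even pure receivers.
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    parent = {}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                # nxt is already on the current path, so we closed a loop.
                # Walk the parents back from node until we reach nxt.
                cycle = [node]
                while cycle[-1] != nxt:
                    cycle.append(parent[cycle[-1]])
                cycle.reverse()
                return cycle
            if color[nxt] == WHITE:
                parent[nxt] = node
                found = visit(nxt)
                if found:
                    return found
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical transfers: money goes A -> B -> C and then back to A.
transfers = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
loop = find_cycle(transfers)
```

The sketch is recursive, so a very long chain of transactions would hit Python's recursion limit; a production version would likely use an explicit stack.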

These algorithms usually accommodate node and link attributes. For instance, a positive numerical attribute on the links can represent the strength of each relationship, which in turn influences the centrality values of the nodes.
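A small sketch can show how link weights change who looks central. It compares plain degree (number of links) with weighted degree, often called strength (sum of link weights), which is the simplest centrality measure and stands in here for the more elaborate ones. The function name `degree_and_strength`, the user names, and the weights are all invented for the example.

```python
from collections import defaultdict


def degree_and_strength(weighted_edges):
    """Return (degree, strength) dictionaries for an undirected weighted network."""
    degree = defaultdict(int)
    strength = defaultdict(float)
    for u, v, weight in weighted_edges:
        if weight <= 0:
            raise ValueError("link weights are assumed to be positive")
        # Each link counts once per endpoint for degree,
        # and contributes its weight to both endpoints for strength.
        degree[u] += 1
        degree[v] += 1
        strength[u] += weight
        strength[v] += weight
    return dict(degree), dict(strength)


# Hypothetical friendships; the weight is how strong the relationship is.
links = [
    ("ana", "ben", 1.0),
    ("ana", "cat", 1.0),
    ("ana", "dan", 1.0),
    ("eve", "ben", 5.0),
    ("eve", "cat", 4.0),
]
degree, strength = degree_and_strength(links)
top_by_degree = max(degree, key=degree.get)
top_by_strength = max(strength, key=strength.get)
```

With these made-up numbers, the user with the most links is not the one with the strongest ties, so the two rankings put different people on top.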

The Fruchterman-Reingold layout shows central nodes (circled in red) and bridges (circled in green) in a social network. Image obtained with BeGraph.

Network analysis requires specialized software to run efficiently, because networks are sparse and most algorithms scale poorly as networks grow. Besides purely mathematical algorithms, visualization plays a crucial role in this analysis. One of the first things to do with a new network is to find an appropriate spatial distribution (layout), so that you can explore it visually and detect key nodes and clusters. When the network grows beyond hundreds of thousands of nodes and millions of links, computing a layout becomes a challenging problem. But in the 21st century, we have access to powerful CPUs and GPUs that can deal with these kinds of tasks!
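To give a feel for what computing a layout involves, here is a naive sketch in the spirit of the Fruchterman-Reingold method: every pair of nodes repels, linked nodes attract, and a cooling "temperature" limits how far nodes move at each step. It is a simplified illustration, not the implementation used by BeGraph or any other tool, and the constants (unit square, starting temperature of 0.1, linear cooling) are assumptions chosen for brevity.

```python
import math
import random


def fruchterman_reingold(nodes, edges, iterations=50, seed=0):
    """Return {node: (x, y)} from a naive force-directed layout.

    Every endpoint in `edges` is assumed to appear in `nodes`.
    """
    nodes = list(nodes)
    if not nodes:
        return {}
    rng = random.Random(seed)          # seeded so runs are repeatable
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    k = math.sqrt(1.0 / len(nodes))    # ideal spacing in a unit square
    temperature = 0.1                  # caps how far a node moves per step
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart (the O(n^2) part).
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force

        # Attraction: linked nodes pull toward each other.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force

        # Move each node along its net force, limited by the temperature.
        for n in nodes:
            length = max(math.hypot(disp[n][0], disp[n][1]), 1e-9)
            step = min(length, temperature)
            pos[n][0] += disp[n][0] / length * step
            pos[n][1] += disp[n][1] / length * step

        temperature -= cooling

    return {n: (p[0], p[1]) for n, p in pos.items()}


# A tiny made-up network: a chain of four nodes.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
layout = fruchterman_reingold(nodes, edges)
```

The all-pairs repulsion loop is what makes large networks hard: with hundreds of thousands of nodes it means tens of billions of pair interactions per iteration, which is why practical tools rely on approximations or on parallel hardware.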

In summary, don’t miss out on the opportunities that network science can offer. The fields of application are endless, and you can obtain unique and valuable information. If network science sounds new and mysterious to you, take a look at the introductory book by A. L. Barabási. If you’re feeling adventurous, check out the more advanced and algorithm-oriented classic, Networks by M. E. J. Newman.