Large Graph Processing

I am writing this post to share the slides of the seminar I gave today
here at University of Pisa about Large Graph Processing. The abstract is
the following:

The main topic of this seminar is large graph processing. We will try to
understand what are the problems that arise when dealing with large
datasets and which are the most used solutions today. Specifically we
will introduce the Bulk Synchronous Parallel (BSP) model that allows us
to process extremely large graphs by distributing the computation to
cluster of machines. We will then move our attention to graph
compression. We will survey two interesting graph compression
techniques, namely K2-trees and WebGraph. The former approach exploits
the sparsity of adjacency matrices to achieve compression through a
recursive spatial decomposition. The latter technique look for
repetitions in the adjacency list in order to achieve a good compression
ratio and fast decompression. At the end we will briefly discuss how the
WebGraph compression algorithm coupled with HyperLogLog counters was
used to obtain the famous “4 degree of separation” in the Facebook
graph.