Many complex systems display a surprising degree of tolerance against errors. For example, relatively simple organisms grow, persist and reproduce despite drastic pharmaceutical or environmental interventions, an error tolerance attributed to the robustness of the underlying metabolic network [1]. Complex communication networks [2] display a surprising degree of robustness: while key components regularly malfunction, local failures rarely lead to the loss of the global information-carrying ability of the network. The stability of these and other complex systems is often attributed to the redundant wiring of the functional web defined by the systems’ components. In this paper we demonstrate that error tolerance is not shared by all redundant systems, but it is displayed only by a class of inhomogeneously wired networks, called scale-free networks. We find that scale-free networks, describing a number of systems, such as the World Wide Web (www) [3–5], Internet [6], social networks [7] or a cell [8], display an unexpected degree of robustness, the ability of their nodes to communicate being unaffected by even unrealistically high failure rates. However,

by
Pedro Domingos, Matt Richardson
- In Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining, 2002

"... One of the major applications of data mining is in helping companies determine which potential customers to market to. If the expected pro t from a customer is greater than the cost of marketing to her, the marketing action for that customer is executed. So far, work in this area has considered only ..."

One of the major applications of data mining is in helping companies determine which potential customers to market to. If the expected pro t from a customer is greater than the cost of marketing to her, the marketing action for that customer is executed. So far, work in this area has considered only the intrinsic value of the customer (i.e, the expected pro t from sales to her). We propose to model also the customer&apos;s network value: the expected pro t from sales to other customers she may inuence to buy, the customers those may inuence, and so on recursively. Instead of viewing a market as a set of independent entities, we view it as a social network and model it as a Markov random eld. We show the advantages of this approach using a social network mined from a collaborative ltering database. Marketing that exploits the network value of customers|also known as viral marketing|can be extremely eective, but is still a black art. Our work can be viewed as a step towards providing a more solid foundation for it, taking advantage of the availability of large relevant databases. Categories and Subject Descriptors H.2.8 [Database Management]: Database Applications| data mining

...s of these approaches, and the discovery of widespread network topologies with nontrivial properties [42], has led to asurry of research on modeling the Web as a semi-random graph (e.g., Kumar et al. =-=[28-=-], Barabasi et al. [1]). Some of this work might be applicable in our context. In retrospect, the earliest sign of the potential of viral marketing was perhaps the classic paper by Milgram [31] estima...

"... We review the recent fast progress in statistical physics of evolving networks. Interest has focused mainly on the structural properties of random complex networks in communications, biology, social sciences and economics. A number of giant artificial networks of such a kind came into existence rece ..."

We review the recent fast progress in statistical physics of evolving networks. Interest has focused mainly on the structural properties of random complex networks in communications, biology, social sciences and economics. A number of giant artificial networks of such a kind came into existence recently. This opens a wide field for the study of their topology, evolution, and complex processes occurring in them. Such networks possess a rich set of scaling properties. A number of them are scale-free and show striking resilience against random breakdowns. In spite of large sizes of these networks, the distances between most their vertices are short — a feature known as the “smallworld” effect. We discuss how growing networks self-organize into scale-free structures and the role of the mechanism of preferential linking. We consider the topological and structural properties of evolving networks, and percolation in these networks. We present a number of models demonstrating the main features of evolving networks and discuss current approaches for their simulation and analytical study. Applications of the general results to particular networks in Nature are discussed. We demonstrate the generic connections of the network growth processes with the general problems

...s”, to provide possibility of an efficient specialized search [10,84,87,88,90,91,113–116]. authorities hubs FIG. 7. A bipartite directed sub-graph in the Web being used for indexing cyber-communities =-=[90,91]-=-. Natural objects for such indexing are specific bipartite sub-graphs (see Fig. 7) [90,91]. One should note that the directed graphs of this kind have a different structure than the bipartite graphs d...

"... We propose a random graph model which is a special case of sparse random graphs with given degree sequences. This model involves only a small number of parameters, called logsize and log-log growth rate. These parameters capture some universal characteristics of massive graphs. Furthermore, from the ..."

We propose a random graph model which is a special case of sparse random graphs with given degree sequences. This model involves only a small number of parameters, called logsize and log-log growth rate. These parameters capture some universal characteristics of massive graphs. Furthermore, from these parameters, various properties of the graph can be derived. For example, for certain ranges of the parameters, we will compute the expected distribution of the sizes of the connected components which almost surely occur with high probability. We will illustrate the consistency of our model with the behavior of some massive graphs derived from data in telecommunications. We will also discuss the threshold function, the giant component, and the evolution of random graphs in this model.

"... Viral marketing takes advantage of networks of influence among customers to inexpensively achieve large changes in behavior. Our research seeks to put it on a firmer footing by mining these networks from data, building probabilistic models of them, and using these models to choose the best viral mar ..."

Viral marketing takes advantage of networks of influence among customers to inexpensively achieve large changes in behavior. Our research seeks to put it on a firmer footing by mining these networks from data, building probabilistic models of them, and using these models to choose the best viral marketing plan. Knowledge-sharing sites, where customers review products and advise each other, are a fertile source for this type of data mining. In this paper we extend our previous techniques, achieving a large reduction in computational cost, and apply them to data from a knowledge-sharing site. We optimize the amount of marketing funds spent on each customer, rather than just making a binary decision on whether to market to him. We take into account the fact that knowledge of the network is partial, and that gathering that knowledge can itself have a cost. Our results show the robustness and utility of our approach.

... trust. Interestingly, social networks, the World-Wide Web, and many naturally occurring networks all exhibit Zipfian, or “scale free” characteristics, and have been the topic of much recent resea=-=rch [17]-=- [1]. Social networks have been the object of much research. One classic paper is that by Milgram [20], which estimated that every person in the world is only six acquaintances away from every other. ...

"... The web may be viewed as a directed graph each of whose vertices is a static HTML web page, and each of whose edges corresponds to a hyperlink from one web page to another. In this paper we propose and analyze random graph models inspired by a series of empirical observations on the web. Our graph m ..."

The web may be viewed as a directed graph each of whose vertices is a static HTML web page, and each of whose edges corresponds to a hyperlink from one web page to another. In this paper we propose and analyze random graph models inspired by a series of empirical observations on the web. Our graph models differ from the traditional Gn;p models in two ways: 1. Independently chosen edges do not result in the statistics (degree distributions, clique multitudes) observed on the web. Thus, edges in our model are statistically dependent on each other. 2. Our model introduces new vertices in the graph as time evolves. This captures the fact that the web is changing with time. Our results are two fold: we show that graphs generated using our model exhibit the statistics observed on the web graph, and additionally, that natural graph models proposed earlier do not exhibit them. This remains true even when these earlier models are generalized to account for the arrival of vertices over time. In particular, the sparse random graphs in our models exhibit properties that do not arise in far denser random graphs generated by Erdos-R&apos;enyi models.

... model should be adapted to capture the notion of an evolving graph. Motivations for modeling the web graph. 1. Many problems we wish to solve on the web (such as the subgraph enumeration problems of =-=[12]-=-) are computationally difficult for general graphs. Nevertheless, a suitable model of the web can help us design and analyze algorithms that work well in practice. They could also be simulated under t...

"... How does a ‘normal ’ computer (or social) network look like? How can we spot ‘abnormal ’ sub-networks in the Internet, or web graph? The answer to such questions is vital for outlier detection (terrorist networks, or illegal money-laundering rings), forecasting, and simulations (“how will a computer ..."

How does a ‘normal ’ computer (or social) network look like? How can we spot ‘abnormal ’ sub-networks in the Internet, or web graph? The answer to such questions is vital for outlier detection (terrorist networks, or illegal money-laundering rings), forecasting, and simulations (“how will a computer virus spread?”). The heart of the problem is finding the properties of real graphs that seem to persist over multiple disciplines. We list such “laws ” and, more importantly, we propose a simple, parsimonious model, the “recursive matrix ” (R-MAT) model, which can quickly generate realistic graphs, capturing the essence of each graph in only a few parameters. Contrary to existing generators, our model can trivially generate weighted, directed and bipartite graphs; it subsumes the celebrated Erdős-Rényi model as a special case; it can match the power law behaviors, as well as the deviations from them (like the “winner does not take it all ” model of Pennock et al. [21]). We present results on multiple, large real graphs, where we show that our parameter fitting algorithm (AutoMAT-fast) fits them very well. 1

...ved deviations from power-laws for the Web graph, which are well-modeled by the truncated, discretized lognormal (“DGX”) distribution of Bi et al. [4]. Graphs also exhibit a strong “community” effect =-=[11, 14]-=-. Most real graphs have surprisingly small diameters: the “six degrees of separation” for the social network, 19 for the Web, and small values for the Internet AS graph [2, 23]. Apart from these, ther...

"... The pages and hyperlinks of the World-Wide Web maybe viewed as nodes and edges in a directed graph. This graph has about a billion nodes today,several billion links, and appears to grow exponentially with time. There are many reasons---mathematical, sociological, and commercial---for studying the e ..."

The pages and hyperlinks of the World-Wide Web maybe viewed as nodes and edges in a directed graph. This graph has about a billion nodes today,several billion links, and appears to grow exponentially with time. There are many reasons---mathematical, sociological, and commercial---for studying the evolution of this graph. We first review a set of algorithms that operate on the Web graph, addressing problems from Web search, automatic community discovery, and classification. We then recall a number of measurements and properties of the Web graph. Noting that traditional random graph models do not explain these observations, we propose a new family of random graph models.

...of complete bipartite subgraphs, as has been observed in the WWW graph, whereas several other models do not. This (and the linear growth variants model) has the similar drawback as the first model in =-=[27]-=-. The out-degree of every vertex is always a constant. Edges and vertices in the exponential growth copying model increase exponentially. Aiello et al. described a general random graph evolution proce...

"... How can we find communities in dynamic networks of social interactions, such as who calls whom, who emails whom, or who sells to whom? How can we spot discontinuity time-points in such streams of graphs, in an on-line, any-time fashion? We propose GraphScope, that addresses both prob-lems, using inf ..."

How can we find communities in dynamic networks of social interactions, such as who calls whom, who emails whom, or who sells to whom? How can we spot discontinuity time-points in such streams of graphs, in an on-line, any-time fashion? We propose GraphScope, that addresses both prob-lems, using information theoretic principles. Contrary to the majority of earlier methods, it needs no user-defined param-eters. Moreover, it is designed to operate on large graphs, in a streaming fashion. We demonstrate the efficiency and effectiveness of our GraphScope on real datasets from sev-eral diverse domains. In all cases it produces meaningful time-evolving patterns that agree with human intuition.

...phs Graph mining has been a very active area in the data mining community. From the exploratory aspect, Faloutsos et al. [10] have shown the power-law distribution on the Internet graph. Kumar et al. =-=[14]-=- discovered the bow-tie model for web graphs. From the algorithmic aspect, graph partitioning has attracted much interest, with prevailing methods being METIS [12] and spectral partitioning [16]. Even...