small-worlds/tex/erdos.tex

\section{The Erdős-Rényi model}

\nd Before 1960, graph theory mainly dealt with the properties of specific individual graphs. In the 1960s, Paul Erdős and Alfred Rényi initiated a systematic study of random graphs. Random graph theory is, in fact, not the study of individual graphs, but the study of a statistical ensemble of graphs (or, as mathematicians prefer to call it, a \emph{probability space} of graphs). The ensemble is a class consisting of many different graphs, where each graph has a probability attached to it. A property studied is said to exist with probability $P$ if the total probability of a graph in the ensemble possessing that property is $P$ (or the total fraction of graphs in the ensemble that has this property is $P$). This approach allows the use of probability theory in conjunction with discrete mathematics for studying graph ensembles.  A property is said to exist for a class of graphs if the fraction of graphs in the ensemble which does not have this property is of zero measure. This is usually termed as a property of \emph{almost every (a.e.)} graph. Sometimes the terms “almost surely” or “with high probability” are also used (with the former usually taken to mean that the residual probability vanishes exponentially with the system size). \s


\subsection{Erdős-Rényi graphs}

\nd Two well-studied graph ensembles are $G_{N,M}$, the ensemble of all graphs with $N$ nodes and $M$ edges, and $G_{N,p}$, the ensemble of all graphs with $N$ nodes and probability $p$ of any two nodes being connected. These two families, initially studied by Erdős and Rényi, are known to be similar if $M = \binom{N}{2} p$, so as long $p$ is not too close to $0$ or $1$ they are referred to as ER graphs. \s

\nd An important attribute of a graph is the average degree, i.e., the average number of edges connected to each node. We will denote the degree of the ith node by $k_i$ and the average degree by $ \langle r \rangle $ . $N$-vertex graphs with $\langle k \rangle = O(N^0)$ are called sparse graphs. \s

\nd An interesting characteristic of the ensemble $G_{N,p}$ is that many of its properties have a related threshold function, $p_t(N)$, such that the property exists, in the “thermodynamic limit” of $N \to \infty$  with probability 0 if $p < p_t$ , and with probability $1$ if $p > p_t$ . This phenomenon is the same as the physical concept of a percolation phase transition. \s

\nd Another property is the average path length between any two nodes, which in almost every graph of the ensemble (with $\langle k \rangle > 1$ and finite) is of order $\ln N$ . The small, logarithmic distance is actually the origin of the “small-world” phenomena that characterize networks.


\subsection{Scale-free networks}

\nd The Erdős-Rényi model has traditionally been the dominant subject of study in the field of random graphs. Recently, however, several studies of real-world networks have found that the ER model fails to reproduce many of their observed properties. One of the simplest properties of a network that can be measured directly is the degree distribution, or the fraction P(k) of nodes having k connections (degree $k$). A well-known result for ER networks is that the degree distribution is Poissonian,

\begin{equation}
    P(k) = \frac{e^{z} z^k}{k!}
\end{equation}

\nd Where $z = \langle k \rangle$. is the average degree. \s Direct measurements of the degree distribution for real networks show that the Poisson law does not apply. Rather, often these nets exhibit a scale-free degree distribution:

\begin{equation}
    P(k) = ck^{-\gamma} \quad \text{for} \quad k = m, ... , K
\end{equation}

\nd Where $c \sim (\gamma -1)m^{\gamma - 1}$ is a normalization factor, and $m$ and $K$ are the lower and upper cutoffs for the degree of a node, respectively. The divergence of moments higher then $\lceil \gamma -1 \rceil$ (as  $K \to \infty$ when $N \to \infty$) is responsible for many of the anomalous properties attributed to scale-free networks. \s

\nd All real-world networks are finite and therefore all their moments are finite. The actual value of the cutoff K plays an important role. It may be approximated by noting that the total probability of nodes with $k > K$ is of order $1/N$

\begin{equation}
    \int_K^\infty P(k) dk \sim \frac{1}{N}
\end{equation}

\nd This yields the result

\begin{equation}
    K \sim m N^{1/(\gamma -1)}
\end{equation}

\nd The degree distribution alone is not enough to characterize the network. There are many other quantities, such as the degree-degree correlation (between connected nodes), the spatial correlations, the clustering coefficient, the betweenness or central-ity distribution, and the self-similarity exponents.

\subsection{Diameter and fractal dimension}

Regular lattices can be viewed as networks embedded in Euclidean space, of a well-defined dimension, $d$. This means that $n(r)$, the number of nodes within a distance $r$ from an origin, grows as $n(r) \sim r^d$ (for large $r$). For fractal objects, $d$ in the last relation may be a non-integer and is replaced by the fractal dimension $d_f$ \s

\nd An example of a network where the above power laws are not valid is the Cayley tree (also known as the Bethe lattice). The Cayley tree is a regular graph, of fixed degree $z$, and no loops. An infinite Cayley tree cannot be embedded in a Euclidean space of finite dimensionality. The number of nodes at $l$ is $n(l) \sim (z - 1)^l$ . Since the exponential growth is faster than any power law, Cayley trees are referred to as infinite-dimensional systems. \s

\nd In most random network models, the structure is locally tree-like (since most loops occur only for $n(l) \sim N$), and since the number of nodes grows as $n(l) \sim \langle k - 1 \rangle^l$, they are also infinite dimensional. As a consequence, the diameter of such graphs (i.e., the minimal path between the most distant nodes) scales as $D \sim \ln N$. Many properties of ER networks, including the logarithmic diameter, are also present in Cayley trees. This small diameter in ER graphs and Cayley trees is in contrast to that of finite-dimensional lattices, where $D \sim N^{1/d_l}$. \s

\nd Similar to ER, percolation on infinite-dimensional lattices and the Cayley tree  yields a critical threshold $p_c = 1/(z - 1)$. For $p > p_c$, a “giant cluster” of order $N$ exists, whereas for $p < pc$,only small clusters appear. For infinite-dimensional lattices (similar to ER networks) at criticality, $p =
p_c$ , the giant component is of size $N^{2/3}$. This last result follows from the fact that percolation on lattices in dimension $d \geq d_c = 6$ is in the same universality class as infinite-dimensional percolation, where the fractal dimension of the giant cluster is $d_f = 4$, and therefore the size of the giant cluster scales as $N^{d_f/d_c} = N^{2/3}$. The dimension $d_c$ is called the “upper critical dimension.” Such an upper critical dimension exists not only in percolation phenomena, but also in other physical models, such as in the self-avoiding walk model for polymers and in the Ising model for magnetism; in both these cases $d_c = 4$.

\nd Watts and Strogatz suggested a model that retains the local high clustering of lattices (i.e., the neighbors of a node have a much higher probability of being neighbors than in random graphs) while reducing the diameter to $D \sim \ln N$ . This so-called, “small-world network” is achieved by replacing a fraction $\varphi$ of the links in a regular lattice with random links, to random distant neighbors. \s

\subsection{Random graphs as a model of real networks}

\nd Many natural and man-made systems are networks, i.e., they consist of objects and interactions between them. These include computer networks, in particular the Internet, logical networks, such as links between WWW pages, and email networks, where a link represents the presence of a person's address in another person's address book. Social interactions in populations, work relations, etc. can also be modeled by a network structure. Networks can also describe possible actions or movements of a system in a configuration space (a phase space), and the nearest configurations are connected by a link. All the above examples and many others have a graph structure that can be studied. Many of them have some ordered structure, derived from geographical or geometrical considerations, cluster and group formation, or other specific properties.  However, most of the above networks are far from regular lattices and are much more complex and random in structure. Therefore, it can be assumed (with a lot of precaution) that they maintain many properties of the appropriate random graph model. \s

\nd In many aspects scale-free networks can be regarded as a generalization of ER networks. For large $\gamma$ (usually, for $\gamma > 4$) the properties of scale-free networks, such as distances, optimal paths, and percolation, are the same as in ER networks. In contrast, for $\gamma < 4$, these properties are very different and can be regarded as anomalous. The anomalous behavior of scale-free networks is due to the strong heterogeneity in the degree of the nodes, which breaks the node-to-node translational homogeneity (symmetry) that exists in the classical
homogeneous networks, such as lattices, Cayley trees, and ER graphs. The small variation of the degrees in the ER model or in scale-free networks with large $gamma$ is insufficient to break this symmetry, and therefore many results for ER networks are the same as for Cayley trees, where the degree of each node is the same.