{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%load_ext autoreload\n",
"%autoreload 2\n",
"\n",
"import os\n",
"import wget\n",
"import zipfile\n",
"import numpy as np\n",
"import pandas as pd\n",
"import networkx as nx\n",
"import plotly.graph_objects as go\n",
"from utils import *\n",
"from collections import Counter\n",
"from tqdm import tqdm\n",
"import time\n",
"import geopandas as gpd\n",
"import multiprocessing\n",
"\n",
"# ignore warnings\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction\n",
"\n",
"## Graph Theory and notations\n",
"\n",
"SCRIVERE\n",
"\n",
"## Aim of the project\n",
"\n",
"SCRIVERE\n",
"\n",
"<!-- Given a social network, which of its nodes are more central? This question has been asked many times in sociology, psychology and computer science, and a whole plethora of centrality measures (a.k.a. centrality indices, or rankings) were proposed to account for the importance of the nodes of a network. \n",
"\n",
"These networks, typically generated directly or indirectly by human activity and interaction (and therefore hereafter dubbed social”), appear in a large variety of contexts and often exhibit a surprisingly similar structure. One of the most important notions that researchers have been trying to capture in such networks is “node centrality”: ideally, every node (often representing an individual) has some degree of influence or importance within the social domain under consideration, and one expects such importance to surface in the structure of the social network; centrality is a quantitative measure that aims at revealing the importance of a node.\n",
"\n",
"Among the types of centrality that have been considered in the literature, many have to do with distances between nodes. Take, for instance, a node in an undirected connected network: if the sum of distances to all other nodes is large, the node under consideration is peripheral; this is the starting point to define Bavelas's closeness centrality \\cite{closeness}, which is the reciprocal of peripherality (i.e., the reciprocal of the sum of distances to all other nodes). \n",
"\n",
"The role played by shortest paths is justified by one of the most well-known features of complex networks, the so-called small-world phenomenon. A small-world network is a graph where the average distance between nodes is logarithmic in the size of the network, whereas the clustering coefficient is larger (that is, neighborhoods tend to be denser) than in a random Erdős-Rényi graph with the same size and average distance. The fact that social networks (whether electronically mediated or not) exhibit the small-world property is known at least since Milgram's famous experiment and is arguably the most popular of all features of complex networks. For instance, the average distance of the Facebook graph was recently established to be just $4.74$. -->"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Background theory: The Erdős-Rényi model\n",
"\n",
"<!-- Before 1960, graph theory mainly dealt with the properties of specific individual graphs. In the 1960s, Paul Erdős and Alfred Rényi initiated a systematic study of random graphs. Random graph theory is, in fact, not the study of individual graphs, but the study of a statistical ensemble of graphs (or, as mathematicians prefer to call it, a probability space of graphs). The ensemble is a class consisting of many different graphs, where each graph has a probability attached to it. A property studied is said to exist with probability $P$ if the total probability of a graph in the ensemble possessing that property is $P$ (or the total fraction of graphs in the ensemble that has this property is $P$). This approach allows the use of probability theory in conjunction with discrete mathematics for studying graph ensembles. A property is said to exist for a class of graphs if the fraction of graphs in the ensemble which does not have this property is of zero measure. This is usually termed as a property of \\emph{almost every (a.e.)} graph. Sometimes the terms “almost surely” or “with high probability” are also used (with the former usually taken to mean that the residual probability vanishes exponentially with the system size). -->\n",
"\n",
"Prior to the 1960s, graph theory primarily focused on the characteristics of individual graphs. In the 1960s, Paul Erdős and Alfred Rényi introduced a systematic approach to studying random graphs, which involves analyzing a collection, or ensemble, of many different graphs. Each graph in the ensemble is assigned a probability, and a property is said to hold with probability $P$ if the total probability of the graphs in the ensemble possessing that property is $P$, or if the fraction of graphs in the ensemble with the property is $P$. This method allows for the application of probability theory in conjunction with discrete math to study ensembles of graphs. A property is considered to hold for a class of graphs if the fraction of graphs in the ensemble without the property has zero measure, which is typically referred to as being true for \"almost every\" graph in the ensemble. The terms \"almost surely\" and \"with high probability\" may also be used, with the former generally indicating that the residual probability decreases exponentially with the size of the system\n",
"\n",
"\n",
"## Erdős-Rényi graphs\n",
"\n",
"<!-- Two well-studied graph ensembles are $G_{N,M}$, the ensemble of all graphs with $N$ nodes and $M$ edges, and $G_{N,p}$, the ensemble of all graphs with $N$ nodes and probability $p$ of any two nodes being connected. These two families, initially studied by Erdős and Rényi, are known to be similar if $M = \\binom{N}{2} p$, so as long $p$ is not too close to $0$ or $1$ they are referred to as ER graphs. \n",
"\n",
"An important attribute of a graph is the average degree, i.e., the average number of edges connected to each node. We will denote the degree of the ith node by $k_i$ and the average degree by $ \\langle r \\rangle $ . $N$-vertex graphs with $\\langle k \\rangle = O(N^0)$ are called sparse graphs. \n",
"\n",
"An interesting characteristic of the ensemble $G_{N,p}$ is that many of its properties have a related threshold function, $p_t(N)$, such that the property exists, in the “thermodynamic limit” of $N \\to \\infty$ with probability 0 if $p < p_t$ , and with probability $1$ if $p > p_t$ . This phenomenon is the same as the physical concept of a percolation phase transition. \n",
"\n",
"Another property is the average path length between any two nodes, which in almost every graph of the ensemble (with $\\langle k \\rangle > 1$ and finite) is of order $\\ln N$ . The small, logarithmic distance is actually the origin of the “small-world” phenomena that characterize networks. -->\n",
"\n",
"There are two well-known ensembles of graphs that have been extensively studied: the ensemble of all graphs with $N$ nodes and $M$ edges, denoted $G_{N,M}$, and the ensemble of all graphs with $N$ nodes and a probability $p$ of any two nodes being connected, denoted $G_{N,p}$. These ensembles, initially studied by Erdős and Rényi, are similar when $M = \\binom{N}{2} p$, and are therefore referred to as ER graphs when $p$ is not too close to 0 or 1.\n",
"\n",
"An important feature of a graph is its average degree, or the average number of edges connected to each node. We will denote the degree of the $i$-th node by $k_i$ and the average degree by $\\langle r \\rangle$. Graphs with $N$ nodes and $\\langle k \\rangle = O(N^0)$ are called sparse graphs.\n",
"\n",
"One interesting property of the ensemble $G_{N,p}$ is that many of its characteristics have a corresponding threshold function, $p_t(N)$, such that the property exists with probability 0 if $p < p_t$ and with probability 1 if $p > p_t$ in the \"thermodynamic limit\" of $N \\to \\infty$. This is similar to the physical concept of a percolation phase transition.\n",
"\n",
"Another property of interest is the average path length between any two nodes, which is typically of order $\\ln N$ in almost every graph of the ensemble (with $\\langle k \\rangle > 1$ and finite). This small, logarithmic distance is the source of the \"small-world\" phenomena that are characteristic of networks.\n",
"\n",
"\n",
"## Scale-free networks\n",
"\n",
"<!-- The Erdős-Rényi model has traditionally been the dominant subject of study in the field of random graphs. Recently, however, several studies of real-world networks have found that the ER model fails to reproduce many of their observed properties. One of the simplest properties of a network that can be measured directly is the degree distribution, or the fraction P(k) of nodes having k connections (degree $k$). A well-known result for ER networks is that the degree distribution is Poissonian,\n",
"\n",
"\\begin{equation}\n",
" P(k) = \\frac{e^{z} z^k}{k!}\n",
"\\end{equation}\n",
"\n",
"Where $z = \\langle k \\rangle$. is the average degree. Direct measurements of the degree distribution for real networks show that the Poisson law does not apply. Rather, often these nets exhibit a scale-free degree distribution:\n",
"\n",
"\\begin{equation}\n",
" P(k) = ck^{-\\gamma} \\quad \\text{for} \\quad k = m, ... , K\n",
"\\end{equation}\n",
"\n",
"Where $c \\sim (\\gamma -1)m^{\\gamma - 1}$ is a normalization factor, and $m$ and $K$ are the lower and upper cutoffs for the degree of a node, respectively. The divergence of moments higher then $\\lceil \\gamma -1 \\rceil$ (as $K \\to \\infty$ when $N \\to \\infty$) is responsible for many of the anomalous properties attributed to scale-free networks. \n",
"\n",
"All real-world networks are finite and therefore all their moments are finite. The actual value of the cutoff K plays an important role. It may be approximated by noting that the total probability of nodes with $k > K$ is of order $1/N$\n",
"\n",
"\\begin{equation}\n",
" \\int_K^\\infty P(k) dk \\sim \\frac{1}{N}\n",
"\\end{equation}\n",
"\n",
"This yields the result\n",
"\n",
"\\begin{equation}\n",
" K \\sim m N^{1/(\\gamma -1)}\n",
"\\end{equation}\n",
"\n",
"The degree distribution alone is not enough to characterize the network. There are many other quantities, such as the degree-degree correlation (between connected nodes), the spatial correlations, the clustering coefficient, the betweenness or central-ity distribution, and the self-similarity exponents. -->\n",
"\n",
"The Erdős-Rényi model has long been the primary focus of research in the field of random graphs. However, recent studies of real-world networks have shown that the ER model does not accurately capture many of their observed properties. One such property that can be easily measured is the degree distribution, or the fraction $P(k)$ of nodes with $k$ connections (degree $k$). A well-known result for ER networks is that the degree distribution follows a Poisson distribution, given by\n",
"\n",
"\\begin{equation}\n",
"P(k) = \\frac{e^{z} z^k}{k!}\n",
"\\end{equation}\n",
"\n",
"where $z = \\langle k \\rangle$ is the average degree. However, measurements of the degree distribution for real networks often show that the Poisson law does not hold, instead exhibiting a scale-free degree distribution of the form\n",
"\n",
"\\begin{equation}\n",
"P(k) = ck^{-\\gamma} \\quad \\text{for} \\quad k = m, ... , K\n",
"\\end{equation}\n",
"\n",
"where $c \\sim (\\gamma -1)m^{\\gamma - 1}$ is a normalization factor, and $m$ and $K$ are the lower and upper cutoffs for the degree of a node, respectively. The divergence of moments higher than $\\lceil \\gamma -1 \\rceil$ (as $K \\to \\infty$ when $N \\to \\infty$) is responsible for many of the unusual properties attributed to scale-free networks.\n",
"\n",
"It is important to note that all real-world networks are finite, so all of their moments are finite as well. The actual value of the cutoff $K$ plays a significant role, and can be approximated by noting that the total probability of nodes with $k > K$ is approximately $1/N$, or\n",
"\n",
"\\begin{equation}\n",
"\\int_K^\\infty P(k) dk \\sim \\frac{1}{N}\n",
"\\end{equation}\n",
"\n",
"This gives the result\n",
"\n",
"\\begin{equation}\n",
"K \\sim m N^{1/(\\gamma -1)}\n",
"\\end{equation}\n",
"\n",
"The degree distribution is not the only characteristic that can be used to describe a network. Other quantities, such as the degree-degree correlation (between connected nodes), spatial correlations, clustering coefficient, betweenness or centrality distribution, and self-similarity exponents, can also provide insight into the network's structure and behavior.\n",
"\n",
"# Diameter and fractal dimension\n",
"\n",
"<!-- Regular lattices can be viewed as networks embedded in Euclidean space, of a well-defined dimension, $d$. This means that $n(r)$, the number of nodes within a distance $r$ from an origin, grows as $n(r) \\sim r^d$ (for large $r$). For fractal objects, $d$ in the last relation may be a non-integer and is replaced by the fractal dimension $d_f$ \n",
"\n",
"An example of a network where the above power laws are not valid is the Cayley tree (also known as the Bethe lattice). The Cayley tree is a regular graph, of fixed degree $z$, and no loops. An infinite Cayley tree cannot be embedded in a Euclidean space of finite dimensionality. The number of nodes at $l$ is $n(l) \\sim (z - 1)^l$ . Since the exponential growth is faster than any power law, Cayley trees are referred to as infinite-dimensional systems. \n",
"\n",
"In most random network models, the structure is locally tree-like (since most loops occur only for $n(l) \\sim N$), and since the number of nodes grows as $n(l) \\sim \\langle k - 1 \\rangle^l$, they are also infinite dimensional. As a consequence, the diameter of such graphs (i.e., the minimal path between the most distant nodes) scales as $D \\sim \\ln N$. Many properties of ER networks, including the logarithmic diameter, are also present in Cayley trees. This small diameter in ER graphs and Cayley trees is in contrast to that of finite-dimensional lattices, where $D \\sim N^{1/d_l}$. \n",
"\n",
"Similar to ER, percolation on infinite-dimensional lattices and the Cayley tree yields a critical threshold $p_c = 1/(z - 1)$. For $p > p_c$, a “giant cluster” of order $N$ exists, whereas for $p < pc$,only small clusters appear. For infinite-dimensional lattices (similar to ER networks) at criticality, $p = p_c$ , the giant component is of size $N^{2/3}$. This last result follows from the fact that percolation on lattices in dimension $d \\geq d_c = 6$ is in the same universality class as infinite-dimensional percolation, where the fractal dimension of the giant cluster is $d_f = 4$, and therefore the size of the giant cluster scales as $N^{d_f/d_c} = N^{2/3}$. The dimension $d_c$ is called the “upper critical dimension.” Such an upper critical dimension exists not only in percolation phenomena, but also in other physical models, such as in the self-avoiding walk model for polymers and in the Ising model for magnetism; in both these cases $d_c = 4$.\n",
"\n",
"Watts and Strogatz suggested a model that retains the local high clustering of lattices (i.e., the neighbors of a node have a much higher probability of being neighbors than in random graphs) while reducing the diameter to $D \\sim \\ln N$ . This so-called, “small-world network” is achieved by replacing a fraction $\\varphi$ of the links in a regular lattice with random links, to random distant neighbors. -->\n",
"\n",
"Regular lattices can be viewed as networks embedded in Euclidean space of a defined dimension $d$, meaning that $n(r)$, the number of nodes within a distance $r$ from an origin, grows as $n(r) \\sim r^d$ for large $r$. For fractal objects, the dimension $d$ in this relation may be a non-integer and is replaced by the fractal dimension $d_f$.\n",
"\n",
"One example of a network where these power laws do not hold is the Cayley tree, also known as the Bethe lattice, which is a regular graph of fixed degree $z$ with no loops. An infinite Cayley tree cannot be embedded in a Euclidean space of finite dimensionality. The number of nodes at level $l$ grows as $n(l) \\sim (z - 1)^l$, which is faster than any power law, making Cayley trees infinite-dimensional systems.\n",
"\n",
"Many random network models have locally tree-like structure (since most loops occur only when $n(l) \\sim N$), and since the number of nodes grows as $n(l) \\sim \\langle k - 1 \\rangle^l$, they are also infinite dimensional. As a result, the diameter of such graphs (i.e., the shortest path between the most distant nodes) scales as $D \\sim \\ln N$. Many properties of ER networks, including the logarithmic diameter, are also present in Cayley trees. This small diameter is in contrast to that of finite-dimensional lattices, where $D \\sim N^{1/d_l}$.\n",
"\n",
"Like ER networks, percolation on infinite-dimensional lattices and the Cayley tree exhibits a critical threshold $p_c = 1/(z - 1)$. For $p > p_c$, a \"giant cluster\" of size $N$ exists, while for $p < p_c$, only small clusters are present. At criticality ($p = p_c$) in infinite-dimensional lattices (similar to ER networks), the giant component is of size $N^{2/3}$. This result follows from the fact that percolation on lattices in dimension $d \\geq d_c = 6$ is in the same universality class as infinite-dimensional percolation, where the fractal dimension of the giant cluster is $d_f = 4$, resulting in a size of the giant cluster that scales as $N^{d_f/d_c} = N^{2/3}$. The dimension $d_c$ is known as the \"upper critical dimension,\" and this concept exists not only in percolation phenomena, but also in other physical models such as the self-avoiding walk model for polymers and the Ising model for magnetism, in both of which $d_c = 4$.\n",
"\n",
"Watts and Strogatz proposed a model that retains the high local clustering of lattices (i.e., the neighbors of a node have a higher probability of being neighbors than in random graphs) while reducing the diameter to $D \\sim \\ln N$. This so-called \"small-world network\" is achieved by replacing a fraction $\\varphi$ of the links in a regular lattice with random links to random, distant neighbors.\n",
"\n",
"\n",
"\n",
"## Random graphs as a model of real networks\n",
"\n",
"<!-- Many natural and man-made systems are networks, i.e., they consist of objects and interactions between them. These include computer networks, in particular the Internet, logical networks, such as links between WWW pages, and email networks, where a link represents the presence of a person's address in another person's address book. Social interactions in populations, work relations, etc. can also be modeled by a network structure. Networks can also describe possible actions or movements of a system in a configuration space (a phase space), and the nearest configurations are connected by a link. All the above examples and many others have a graph structure that can be studied. Many of them have some ordered structure, derived from geographical or geometrical considerations, cluster and group formation, or other specific properties. However, most of the above networks are far from regular lattices and are much more complex and random in structure. Therefore, it can be assumed (with a lot of precaution) that they maintain many properties of the appropriate random graph model. \n",
"\n",
"In many aspects scale-free networks can be regarded as a generalization of ER networks. For large $\\gamma$ (usually, for $\\gamma > 4$) the properties of scale-free networks, such as distances, optimal paths, and percolation, are the same as in ER networks. In contrast, for $\\gamma < 4$, these properties are very different and can be regarded as anomalous. The anomalous behavior of scale-free networks is due to the strong heterogeneity in the degree of the nodes, which breaks the node-to-node translational homogeneity (symmetry) that exists in the classical\n",
"homogeneous networks, such as lattices, Cayley trees, and ER graphs. The small variation of the degrees in the ER model or in scale-free networks with large $gamma$ is insufficient to break this symmetry, and therefore many results for ER networks are the same as for Cayley trees, where the degree of each node is the same. -->\n",
"\n",
"Many natural and man-made systems can be represented as networks, consisting of objects and interactions between them. Examples include computer networks, particularly the Internet, logical networks such as links between web pages and email networks, where a link represents the presence of an individual's address in another person's address book, and social interactions in populations or work relations. Networks can also describe possible actions or movements of a system in a configuration space (a phase space), with nearest configurations connected by a link. All of these examples and many others have a graph structure that can be studied. Many of these networks have some ordered structure, derived from geographical or geometrical considerations, cluster and group formation, or other specific properties, but most of them are far from regular lattices and are much more complex and random in structure. Therefore, it is often assumed (with caution) that they share many properties with the appropriate random graph model.\n",
"\n",
"In many ways, scale-free networks can be considered a generalization of ER networks. For large $\\gamma$ (typically $\\gamma > 4$), the properties of scale-free networks such as distances, optimal paths, and percolation are the same as in ER networks. In contrast, for $\\gamma < 4$, these properties are very different and can be considered anomalous. The anomalous behavior of scale-free networks is due to the strong heterogeneity in the degrees of the nodes, which breaks the node-to-node translational homogeneity (symmetry) present in\n",
"\n",
"---"
]
},
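{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Erdős-Rényi properties discussed in the background section can be checked numerically. The cell below is a minimal sketch, not part of the original analysis: the values of `N`, `z` and the random seed are arbitrary assumptions chosen for illustration. It compares the empirical degree distribution of one $G_{N,p}$ sample with the Poisson law, and the average path length of its largest connected component with the estimate $\\ln N / \\ln \\langle k \\rangle$."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: N, z and the seed are assumed values, not taken from the datasets\n",
"import math\n",
"import networkx as nx\n",
"\n",
"N, z = 2000, 6.0\n",
"p = z / (N - 1)  # chosen so that the expected average degree p * (N - 1) equals z\n",
"G = nx.gnp_random_graph(N, p, seed=42)\n",
"\n",
"# Each edge contributes to the degree of two nodes\n",
"mean_degree = 2 * G.number_of_edges() / N\n",
"\n",
"# hist[k] is the number of nodes with degree k\n",
"hist = nx.degree_histogram(G)\n",
"for k in range(0, 13, 2):\n",
"    empirical = hist[k] / N if k < len(hist) else 0.0\n",
"    poisson = math.exp(-z) * z**k / math.factorial(k)\n",
"    print(f'k={k:2d}  empirical={empirical:.4f}  Poisson={poisson:.4f}')\n",
"\n",
"# Path lengths are only defined within a connected component, so use the largest one\n",
"giant = G.subgraph(max(nx.connected_components(G), key=len))\n",
"print('mean degree:', round(mean_degree, 2))\n",
"print('average path length:', round(nx.average_shortest_path_length(giant), 2))\n",
"print('ln N / ln z:', round(math.log(N) / math.log(z), 2))"
]
},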
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Discovering the datasets"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"To perform our analysis, we will use the following datasets:\n",
"\n",
"- **Brightkite**\n",
"- **Gowalla**\n",
"- **Foursquare**\n",
"\n",
"We can download the datasets using the function `download_dataset` from the `utils` module. It will download the datasets in the `data` folder, organized in sub-folders in the following way:\n",
"\n",
"```\n",
"data\n",
"├── brightkite\n",
"│ ├── brightkite_checkins.txt\n",
"│ └── brightkite_friends_edges.txt\n",
"├── foursquare\n",
"│ ├── foursquare_checkins.txt\n",
"│ ├── foursquare_friends_edges.txt\n",
"│ └── raw_POIs.txt\n",
"└── gowalla\n",
" ├── gowalla_checkins.txt\n",
" └── gowalla_friends_edges.txt\n",
"```\n",
"\n",
"If any of the datasets is already downloaded, it will not be downloaded again. For further details about the function below, please refer to the `utils` module.\n",
"\n",
"> NOTE: the Stanford servers tends to be slow, so it may take a while to download the datasets. It's gonna take about 5 minutes to download all the datasets.\n",
"\n",
"---\n",
"\n",
"### A deeper look at the datasets\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"download_datasets()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's have a deeper look at them.\n",
"\n",
"## Brightkite\n",
"\n",
"[Brightkite](http://www.brightkite.com/) was a location-based social networking service that allowed users to share their locations by checking in. The friendship network data was collected using the Brightkite public API. There are two datasets available for analysis: \n",
"\n",
"- `brightkite_checkins.txt`, which contains check-in data in the form of a tab-separated file with five columns: `user id`, `check-in time`, `latitude`, `longitude`, and `location id`\n",
" \n",
"- `brightkite_friends_edges.txt`, which is a tab-separated file with two columns containing user IDs and representing the friendship network in the form of a graph edge list. \n",
"\n",
"The `brightkite_checkins.txt` dataset must be converted into a graph before it can be analyzed properly, while the `brightkite_friends_edges.txt` dataset is already in a usable form for graph analysis.\n",
"\n",
"Let's have a more clear view of where our data have been generated"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of unique users: 35538\n"
]
},
{
"data": {
"text/plain": [
"<AxesSubplot: >"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAGdCAYAAADdSjBDAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABGt0lEQVR4nO3df3AUZZ4/8M8kJAOJJM4QNpPRZMLVreftxoWVRE2u1og/QEpwPbcMJFV3ULV6F9aIaKw7cD1B9iJxFbw9cTN1u5S4nvzwTvHuSncP3EU9DtSA8N2ge6WskKAQWSFmIEASk+f7R+8z093p7ume6Zl+uuf9quqaZKZn5ume7v7089vHGGMEAAAAQspzOgEAAACgD4EaAABAYAjUAAAAAkOgBgAAEBgCNQAAgMAQqAEAAASGQA0AACAwBGoAAACBTXI6AdkwPj5OJ06coKlTp5LP53M6OQAAAMQYo7Nnz1I4HKa8PP18c04E6hMnTlBlZaXTyQAAAJjg+PHjdPnll+u+nhOBeurUqUQk7YySkhKHUwMAAEAUi8WosrIyHqP05ESg5sXdJSUlCNQAACCUZFWyaEwGAAAgMARqAAAAgSFQAwAACAyBGgAAQGAI1AAAAAJDoAYAABAYAjUAAIDAEKgBAAAEhkANAAAgMARqAAAAgSFQAwAACCyjgfrtt9+mhQsXUjgcJp/PR6+++qri9aVLl5LP51Ms1113nWKd4eFhuu+++6isrIyKi4vp9ttvp08//TSTyQYAABBGRgP10NAQzZw5kzZu3Ki7zq233konT56ML6+//rri9RUrVtCOHTto27ZttGfPHjp37hwtWLCAxsbGMpl0ACHk5xP5fNIjAOSmjM6eNX/+fJo/f77hOn6/n0KhkOZrg4ODtGnTJnrhhRfo5ptvJiKif/3Xf6XKykp64403aN68ebanGUAk4+PKRwDIPY7XUb/55pv0ta99ja644gq655576NSpU/HXDhw4QKOjozR37tz4c+FwmGpqamjv3r26nzk8PEyxWEyxAAAAuJGjgXr+/Pn04osv0m9+8xtav349dXd304033kjDw8NERNTf30+FhYUUCAQU7ysvL6f+/n7dz123bh2VlpbGl8rKyoxuB0CmFBUpHwEg92S06DuZRYsWxf+uqamh2tpaikQi9Nprr9Gdd96p+z7GmOFE26tWraIHH3ww/n8sFkOwBlcaGnI6BQDgNMeLvuUqKiooEonQxx9/TEREoVCIRkZGaGBgQLHeqVOnqLy8XPdz/H4/lZSUKBYAAAA3EipQnz59mo4fP04VFRVERDR79mwqKCigXbt2xdc5efIkHT58mBoaGpxKJgAAQNZktOj73LlzdOTIkfj/R48epUOHDlEwGKRgMEhr1qyh733ve1RRUUHHjh2jhx9+mMrKyugv//IviYiotLSUvv/971N7eztNmzaNgsEgPfTQQ3TVVVfFW4EDAAB4WUYD9f79+2nOnDnx/3m98ZIlS6irq4t6enroF7/4BX355ZdUUVFBc+bMoe3bt9PUqVPj73n66adp0qRJ1NTURBcuXKCbbrqJNm/eTPnoWAoAADnAxxhjTici02KxGJWWltLg4CDqqwEAQAhmY5NQddQAAACghEANAAAgMARqAAAAgSFQAwAACAyBGgAAQGAI1AAAAAJDoAYAABAYAjUAAIDAEKgBAAAEhkANAAAgMARqAAAAgSFQAwAACAyBGgAAQGAI1AAAAAJDoAYAABAYAjUAAIDAEKgBAAAEhkANAAAgMARqAAAAgSFQAwAACAyBGgAAQGAI1AAAAAJDoAYAABAYAjUAAIDAEKgBAAAEhkANAAAgMARqAAAAgSFQAwAACAyBGgAAQGAI1AAAAAJDoAYAABAYAjUAAIDAEKgBAAAEhkANAAAgMARqAAAAgSFQAwAACAyBGgAAQGAI1AAAAAJDoAYAABAYAjUAAIDAEKgBAAAEhkANAAAgsIwG6rfffp
sWLlxI4XCYfD4fvfrqq4rXGWO0Zs0aCofDNGXKFLrhhhvogw8+UKwzPDxM9913H5WVlVFxcTHdfvvt9Omnn2Yy2QAAAMLIaKAeGhqimTNn0saNGzVf//GPf0wbNmygjRs3Und3N4VCIbrlllvo7Nmz8XVWrFhBO3bsoG3bttGePXvo3LlztGDBAhobG8tk0gEAAITgY4yxrHyRz0c7duygO+64g4ik3HQ4HKYVK1bQ3//93xORlHsuLy+nJ554gv72b/+WBgcHafr06fTCCy/QokWLiIjoxIkTVFlZSa+//jrNmzfP1HfHYjEqLS2lwcFBKikpycj2AQAAWGE2NjlWR3306FHq7++nuXPnxp/z+/3U2NhIe/fuJSKiAwcO0OjoqGKdcDhMNTU18XW0DA8PUywWUywAAABu5Fig7u/vJyKi8vJyxfPl5eXx1/r7+6mwsJACgYDuOlrWrVtHpaWl8aWystLm1AMAAGSH462+fT6f4n/G2ITn1JKts2rVKhocHIwvx48ftyWtAAAA2eZYoA6FQkREE3LGp06diueyQ6EQjYyM0MDAgO46Wvx+P5WUlCgWAAAAN3IsUM+YMYNCoRDt2rUr/tzIyAi99dZb1NDQQEREs2fPpoKCAsU6J0+epMOHD8fXAQAA8LJJmfzwc+fO0ZEjR+L/Hz16lA4dOkTBYJCqqqpoxYoV9Pjjj9PXv/51+vrXv06PP/44FRUVUUtLCxERlZaW0ve//31qb2+nadOmUTAYpIceeoiuuuoquvnmmzOZdAAAACFkNFDv37+f5syZE///wQcfJCKiJUuW0ObNm+nv/u7v6MKFC/SDH/yABgYG6Nprr6WdO3fS1KlT4+95+umnadKkSdTU1EQXLlygm266iTZv3kz5+fmZTDoAAIAQstaP2knoRw0AAKIRvh81AAAAJIdADQAAIDAEagAAAIEhUAMAAJjU0kI0aZL0mC0I1AAAACa99BLR2Jj0mC0I1AAAACY1NRHl50uP2YLuWQAAAA5A9ywAAAAPQKAGAAAQGAI1AACAwBCoAQAABIZADQAAIDAEagAAAIEhUAMAWODEyFSQ2xCoAcC0aJSoulp6zFVOjEwFuQ2BGgBM6+wk6u2VHlNVWEjk80mPbuTEyFSQ2xCoAcC0lSuJIhHpMVWjo8pHt9myheirr6RHgGzAEKIAkFWFhVKQLiggGhlxOjUAzjEbmyZlMU0AAAjOABah6BsAAEBgCNQAAAACQ6AGAAAQGAI1gACCQanLUjDodEoAQDQI1ALy+RIL5IaBAeUjSCIR5fkQiWDAFVDKhUF4EKhdSH7xQg7MGwIB5WMuiUaJ8vKUAZkvfX3Kdfv6pAFXli0jmjbN2xdnMMeOQXhEh0DtQvKLF3Jg3nDmDBFj0qPXqXPJy5ZJ227VmTPSezHudm6zYxAe0SFQC4ixxKKlqirxdyCQG0U/4G7y4KzOJacL427nttZWomPHpEevwoAnLtTbq/y/ujpR9OPlgxXcy+7gLIdxt8HrkKP2gFwo+gFz1BNe8Jxsfn6iMZabdXUpS5wYc+e42ygFAysQqD2gtZWooYGorQ11dblOPuFFNJrIyY6PS499fdmfT9nO3gvLlkmfl+fyK1cuNIAC+7j8cHcfO+6kW1omto7FHLmgtmyZ9vPZOlai0cx1MTTb+Ex9vtXVSWkqLpYe6+oyk75kUAoGVmD2rCzj9cmRiNQAgki6iHR2SietvI5ZfpFjTFpv+XLz0wN6/5cFtWSBsaiI6LvflYJ0U1NmioyjUaJ7703k4jPFzPGtPt+09g/OE3CK2diEHHWWad1JaxWDaeW4OzvFncMXdW7uMDSUufmU5d2tMh2k+fcloz7famulx6Ii5f8AIkOOWgBaOWqeE+AYk4rp9u83/7nZ/GW1Sgog+7SCl89nLXCqS3LM4HNMZ1u6xzg/p2pribq77UmT6PLypP1m9bgA+yFHLTieA62rkxqBNTQoi715ToC3ciUyF6TlrWLl38
NzupkaUxp1bmJQt4hmzL6LsV6piVNB2g78nNq/X9quTJYMZbsRnx5+bRAhi4aSOHOQo3aIOsecny8VR+ox2yhH/Ws
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df_brighkite = pd.read_csv(os.path.join('test_data', 'brightkite', 'brightkite_checkins_full.txt'), \n",
" sep='\\t', \n",
" header=None,\n",
" names=['user id', 'check-in time', 'latitude', 'longitude', 'location id'],\n",
" parse_dates=['check-in time'],\n",
" engine='pyarrow')\n",
"\n",
"# take only the dates from 2009\n",
"df_brighkite = df_brighkite[df_brighkite['check-in time'].dt.year == 2009]\n",
"\n",
"# convert the dataframe to geopandas dataframe\n",
"gdf_brightkite = gpd.GeoDataFrame(df_brighkite, geometry=gpd.points_from_xy(df_brighkite.longitude, df_brighkite.latitude))\n",
"\n",
"# plot the geopandas dataframe\n",
"print(\"Number of unique users: \", len(df_brighkite['user id'].unique()))\n",
"gdf_brightkite.plot(marker='o', color='blue', markersize=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Familiar shape, isn't it? As we can see there are ~35k, a bit too much for our future computation. Let's take a subset, like Europe!"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of unique users in Europe: 8525\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAFiCAYAAABWCbpTAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAACBsUlEQVR4nO2df7Bd1XXf172P+578hPTyHj/0A3hPTAzJBAJxfFVbNIE0sYWd2KahMfA0LvIUmIoa1zT6w7JpRtCp0kc6uJMM5r62znjIJJJgjGmdFKehMwgSq9Tih8cvpNMhRr+KwBiQpYfEk0Da/eN0vbvuemvvs/f5fe5dn5kzT7r33HP2OWefvb577bXXbhhjDCiKoiiKohREs+wCKIqiKIoyWKj4UBRFURSlUFR8KIqiKIpSKCo+FEVRFEUpFBUfiqIoiqIUiooPRVEURVEKRcWHoiiKoiiFouJDURRFUZRCOafsAnDOnj0LR44cgRUrVkCj0Si7OIqiKIqieGCMgfn5eVi7di00m27fRuXEx5EjR+CSSy4puxiKoiiKoiTg8OHDcPHFFzv3qZz4WLFiBQBEhV+5cmXJpVEURVEUxYfjx4/DJZdcsmjHXVROfOBQy8qVK1V8KIqiKErN8AmZ0IBTRVEURVEKRcWHoiiKoiiFouJDURRFUZRCUfGhKIqiKEqhqPhQFEVRFKVQVHwoiqIoilIoKj4URVEURSmUYPHx6quvwuc+9zk477zzYHR0FH7pl34Jnn/++cXvjTFw7733wtq1a+EDH/gA/Nqv/Rq89NJLmRZaURRFUZT6EiQ+jh49Cv/wH/5DaLVa8N3vfhf+7u/+Dh544AH4mZ/5mcV9/uAP/gC+9rWvwYMPPgj79u2D1atXw8c//nGYn5/PuuyKoiiKotSQhjHG+O68bds2+N73vgd//dd/LX5vjIG1a9fC3XffDV/+8pcBAODUqVOwatUquP/+++Gf//N/HnuO48ePw9jYGBw7dkwznCqKoihKTQix30Gej+985zvQbrfhs5/9LFx44YXwoQ99CP7zf/7Pi9/v378fXn/9ddi4cePiZyMjI3DdddfB3r17xWOeOnUKjh8/3rMpihLP7CzAunXR3zqA5V2/HuCccwA2bYo2/Hej4bc1m9H+dWd2NrqWRgNgYiL98aamuvfnvPPs9YLe8/Xru/c17p7S32VZ9+hx0yLVMRt4v6am0p+3LlSqzTABjIyMmJGREfOVr3zFvPDCC2Z2dtYsW7bMPPzww8YYY773ve8ZADCvvvpqz+/uuOMOs3HjRvGY27dvNwCwZDt27FhI0RSl7+h0jJmaiv7SfxtjzOSkMQDRNjW19Df4XaORT9na7ej4k5O95XJBy5V2Gxpyn4vuWzTT01H5pqeXfof3Tdrw2U1PGzMxsfQa6Pf0b6cjH6/V6i0HlqvRsJfB9SyHhrr3nj5L6TpDoMdNC69jrmNKdcT17PoBvD+0zciSY8eOedvvoFez1WqZDRs29Hz2xS9+0Xz0ox81xnTFx5EjR3r2uf322831118vHnNhYcEcO3ZscTt8+LCKD0UxsrHGRoMbLddvACKjh1BjlUScSAaUN2aSOMGGfXKya3CSbI1GvHHwFR8o4iYnw+4BF4MUmzG1iQQAY8bHu89OujfGLP0e/8aJOiwHPa5LgLRa8jVTw0yvJa1oSGrwsY4NDy8VZO12/DGlZ5+lEKoirnqbBbmJj8nJSXPbbbf1fPbQQw+ZtWvXGmOM+dGPfmQAwLzwwgs9+3zmM58xt956q9c5QgqvKP0MN1a0McWGc3S093Pu+ZCMsPQ9GnQuFNrtpY0V/Z3N8+Eyhmk8ID6Npu264+5vHLThpkKAl2l6Orqfo6PxwpDeOzzu+Lh8DVTQTUwY02x2haXrnlHPh+999iGpaGi13CLHF/5u4P1MSqcTPTMfcavI5CY+pqenza/8yq/0fHb33XcvekPOnj1rVq9ebe6///7F70+dOm
XGxsbM7Oys1zlUfChKF2yocZuYiBpIqXcrGQ/pM1sPHHvDKGi4kUTQ2KE3pdOJyjUx0TWkccMLoaKDCxxXD46fixtzhIuBOKjLutNxGzxJnPBhE349KCay3mz3xrXlOfQQKnJsSJ6PpD36uOfZb+TlAclNfHz/+98355xzjtmxY4d5+eWXzZ/92Z+Z0dFR86d/+qeL+8zMzJixsTHz7W9/28zNzZnp6WmzZs0ac/z48cwLryiDAjYWPBaAGgpf8WHMUoPcaHTFR7PZ/X50NL6RsnkyJJGBgiWNMUWhYzMU/JptQqbTiQwXQHS9cQaXN9hSA45CjIpGlzHLQ2xkteU19ID3ptnMdwggBJcnqx/JK/YjN/FhjDF//ud/bq688kozMjJifv7nf978p//0n3q+P3v2rNm+fbtZvXq1GRkZMddee62Zm5vLpfCKMgjwHp7Us7UNIfj0gLEBom50yVgj3G3e6Sz10ODvpHJOTLhjDuLEAxUg1NtiQxrGwGvmYq7ZDHs2HCng0VW+sgWGdL+LCrrMO/gxDuqJku4FD9jtJ2rn+SgCFR+KYh+amJrqDruMjnb3lz5zgQa50YgfuuAGIk7k4CYdV/Lc2LbxcfnYKDgk4xXXqPLvXeWRYjlowCz+pcYJhWKz2RuTwcsQKgpscSC2e59ky2tmlI28gx/jcAX40i3UA0TFTGggc91R8aEoNUMy9q7eKQcNXdqeO+IKOJUCBqVeI0KNTIj4QKFB/08NvWS84nrTtFcfF6jJj+GKycDyxF0PxibQz/C4vt6gPLdBgg4tuoQH93zETTPn9WSQUPGhKDWDN3ouQ2bM0sj8svMTSIZTCpZ1zcbx2eJ659J9oMNWIUKHGxWXkTLG/7pojpZmc2kAr/QbOhVXxUc2+NyPUK+gMb31RD0fdnRVW0WpIFu22L9bvx7gC18AOHkyauIefRRg506A99+P/obgyi4Zkg3x7FmAycnu/9ttgPfe693npz+N/h440LsvpdUCGB2NMk9KGNPNyCmxdy/AmTPRX+S556K/p0/Lv5mcjLJc7tgB0OlE/+50lj6DBx6IMpFOTETfc7Zti8ofx6FD0d+JiaisuC4n/p2eBhgaiu4hZj49erT7OyV7sD6221Edw6ynMzNL9223u7+ZmoqeO+WBB7p16ODB/MpcewoQQ0Go50NRInAsutFwj0ujhwFjJIzxH093JVXCnvbERLKxeWmYIq73TknSS5euGz0LkncmDXzYBu+/a/hkaEgOzqUbekDi9stq64eppUk8fzQ2g98D2/vDvVTGZJe3pB/QYRclNWW78RV7Rkkfg0xjH1zP0vUdn94bZ6R4g51kmIASl/fCFVQqzYThAi6NS9xVJi5K8DowGNhn6El63lnHhPhMo6b3tCrTYiWoiA5pu0KvSxKJWYnZfkDFh5Kafk8zXDd8jInN80GNbhIx6dNAS0maMLbBFa/QatmPH5eR0yaG6D7Ua5NUUEu/cxl0yUBJSDNYaLm554PGiqTZfKcoU8qeFhsHfUau9PZpE5HxZ2KMej4oKj6U1Kjno1rEGRSXkaNGPEsxOT0d9epbrfheuTG9LmtfQyB5P+J67NzQpjGaksHnng2atVS6dh+PE4qWPIZaMN07F6ghVN3zQbG1XWkFFP4eA5dDFlUcFFR8KJlSp4anX3H1Yn3SjKNQ4OuNcEJEZ1x+BGl8PBSc1SOJGRt02nGaXqnL84KgUcdZOHg+em9cC8zhd1kLDlomJSILz4c0rEjX+xn0NlLFh5IYKVlV1V2ugwDvVTca8fEY3Fj6PEduTF3QhFouA50GmwDwbeht5aGp1YeH5ZTpLnFFr59+Rqfrjo/HLzDnk2/EJTzp+SYmuvdsULyWktEv6vrpufGZ4jIFcflo+hUVH0pieGNqC95TisXmAcAtzvOBx4hrBH0zpUpejjwafJcAQAHmCjyli+VRpCES+jkOpVBvEV2CXZrpQoc3qEjigi8LrwYVNINk3DiSuEwTr0
aH2ULqsiRWeRZdFIf9jIoPJTHc+NDeoVI+th65ixBvhG9wqeSJyAMaQGjbhoa63hc6vOJq9ONmD+H1c4NP/1LhZYx
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"gdf_brightkite = gdf_brightkite[gdf_brightkite['latitude'] < 60]\n",
"gdf_brightkite = gdf_brightkite[gdf_brightkite['latitude'] > 35]\n",
"gdf_brightkite = gdf_brightkite[gdf_brightkite['longitude'] < 30]\n",
"gdf_brightkite = gdf_brightkite[gdf_brightkite['longitude'] > -10]\n",
"\n",
"gdf_brightkite.plot(marker='o', color='blue', markersize=1)\n",
"\n",
"# update the pandas dataframe with the new values\n",
"df_brighkite = gdf_brightkite\n",
"print(\"Number of unique users in Europe: \", len(df_brighkite['user id'].unique()))\n",
"\n",
"# remove from memory the geopandas dataframe, it was only used for plotting\n",
"del gdf_brightkite"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Perfect! Now we can create a new .txt file, only with the information that we need"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"# update the file with the new values. Drop the columns that are not needed\n",
"df_brighkite.to_csv(\n",
" os.path.join('data', 'brightkite', 'brightkite_checkins.txt'), \n",
" sep='\\t', \n",
" header=False, \n",
" index=False, \n",
" columns=['user id', 'location id'])\n",
"\n",
"# I prefer not to delete the full dataset, since it's bad practice in my opinion. If you want to delete it, uncomment the following line\n",
"\n",
"# os.remove(os.path.join('data', 'brightkite', 'brightkite_checkins_full.txt'))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Gowalla\n",
"\n",
"Gowalla is a location-based social networking website where users share their locations by checking-in. The friendship network is undirected and was collected using their public API. As for Brightkite, we will work with two different datasets. This is how they look like after being filtered by the `download_dataset` function:\n",
"\n",
"- `data/gowalla/gowalla_checkins.txt`: the checkins, a tsv file with 5 columns: `user id`, `check-in time`, `latitude`, `longitude`, `location id`\n",
"\n",
"- `data/gowalla/gowalla_friends_edges.txt`: the friendship network, a tsv file with 2 columns of users ids. This file it's untouched by the function, it's in the form of a graph edge list. \n",
"\n",
"--- \n",
"\n",
"Let's have a more clear view of where our data have been generated"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of unique users: 12611\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAisAAADaCAYAAABn/wK9AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAwu0lEQVR4nO3df3AT550/8Lcs2/IPwLExWFbAMpcO7V1JQ8EcgV7ipr2QZAgNzU0A+x8zlzJjDh9N4k4Hjmb4kfPhND+afsNh3TU5LpmO+XFzpHe9JD3gJoTkgNY4nolD2wkcYDuAS8BgO5DYYD3fP7aPtFqt5JWs1a5W79eMRpa0Wj1ar3Y/+/z4PC4hhAARERGRTeVYXQAiIiKieBisEBERka0xWCEiIiJbY7BCREREtsZghYiIiGyNwQoRERHZGoMVIiIisjUGK0RERGRrDFaIiIjI1hisEBERka2ZGqxUV1fD5XJF3datWwcAWL16ddRrd999t5lFIiIiogyTa+bKOzo6MDY2Fnr80Ucf4f7778djjz0Weu7BBx/Erl27Qo/z8/PNLBIRERFlGFODlWnTpkU8bm1txR133IHa2trQcx6PB16vN+nPCAaDuHDhAiZPngyXy5X0eoiIiCh9hBAYHh6Gz+dDTk78hh5TgxW10dFR/PznP8dTTz0VEVQcPnwY06dPx2233Yba2lq0tLRg+vTpMdczMjKCkZGR0OPz58/jz/7sz0wtOxEREZmjr68PM2bMiLuMSwgh0lGYffv2ob6+Hr29vfD5fACAvXv3YtKkSfD7/Th79iyefvpp3Lp1C52dnfB4PLrr2bJlC7Zu3Rr1fF9fH6ZMmWLqdyAiIqLUGBoawsyZM3Ht2jWUlJTEXTZtwcoDDzyA/Px8/PKXv4y5zMWLF+H3+7Fnzx48+uijustoa1bklx0cHGSwQkRElCGGhoZQUlJi6Pydlmagnp4eHDp0CPv374+7XGVlJfx+P06dOhVzGY/HE7PWhYiIiJwnLXlWdu3ahenTp2Pp0qVxl7ty5Qr6+vpQWVmZjmIRERFRBjA9WAkGg9i1axcaGhqQmxuuyPnss8/wgx/8AMeOHcO5c+dw+PBhLFu2DOXl5fjud79rdrGIiIgoQ5gerBw6dAi9vb3467/+64jn3W43uru78cgjj2D27NloaGjA7NmzcezYMUyePNnsYhERxRcIANXVyj0RWSptHWzNkkgHHSIiw6qrgZ4ewO8Hzp2zujREjpPI+ZtzAxER6dmwQQlUNmywuiREWY81K0SU+dxuIBgEcnIA1RQfRGRfrFkhouxRX68EKkD4XqusDHC5lNuCBekrGxGlBIOVePx+5eDm91tdEiKKZd++8N+x5he5ejX894kTSqfZ4mKlRqa+3tzyEdGEMViJp7c38p6I7GfFCiXoqKuL3QRUWhr5eO1a4MYNpSZm9+5wrQtH/hDZEoOVeKqqIu+JyH7a24Fbt5R7IDzkuL4+PPR4YACoqRl/Xa2tZpaUiJKUtlmXM1JPT+rWpZppGlVVqV03ESlBSWsrMDysBCeffKLUtDQ1KTUpRnDkD5EtsWbFCmxWIrtxQgK0TZuUi4DhYaXvirwZHR0kBNDYaG4ZiSgpDFaswGYlspvWVuVEnwnNIDk5Sk1lTg6Qnx/ubzIwoLx+86bSF0Xe63G7leBENg2pm4jq64HcXHa8JbIRBivpIkT4lilNQPIkoG7CImeycwI0ba2PTA0lhBKQJGPFCuW+o0NZT0dH+LV9+5TaGPUoIyKyFIMVsxUXKyf74mKrS0IUW2OjklLejs0g2lofGTwnE0SXlirBSXt7ZDAuA6FAIDz8+etfn3jZiSglGKyYyeVShkcCyj2rlYkSJ2t9pk1TmmdWrVICjp07jb0/JydcqymbirRaW5VAZd26cG3Nhx+mpvxENGFMt28m7ZWf260MsSSi+OrrlWaYFSvCQ5Jzc5XmmZwc5beUaBOQ262sb/du4+/J7M
Mjka0x3b5dyXZyIopP3W/E7VYC/7Ex5V52nk2U0X4oshmopob9tohsgsGKmdSdamU7ORGNTwb2Y2ORI3omOpJuvAuGmhrlM7WdbonIUgxWMo0T8mFQ9liwILnJA2MF9hMZSVdVpaxXfQGhdeIEa1GIbIjBSqbJpHwYlN0CAeXkD4QnD5w6VblZEWxrA538/PHfEy+wIaK0YbCSaazMh8H2e0qENqBeu1YZjTMwYCzYNjtQUPd7KSoy5zOIKCUYrGQaO+fDIFKLF1AnGmwLYW7m5xdeSD44UgfxTE9AZAoOXSbj1DUqmb3bkNnq640NEU5kP0qmRi/e+tXrk8Oak+kEr15PInMREWU5Dl3ORuloomH7PRm1d6+x5RLZXxOtWREi/u9C7stud+rS6xcUTHwdRBSFwQrFx34qZJS6A63bbfx9sfatQCBy/zM6W3lVlX5AHWsU3YoV4ZqVZAgBtLUpfcleeCG5dRBRXKYGK1u2bIHL5Yq4eb3e0OtCCGzZsgU+nw+FhYX45je/iZMnT5pZJJoI7fxGHEZNaps2hTvQ5uUl9l71HD1yJuW1axMvQ1FR7OHNsUbRtbcrmaUnkgeJfcmITGV6zcpXv/pVXLx4MXTr7u4OvfbjH/8YL774Inbs2IGOjg54vV7cf//9GB4eNrtYzmNGE402CJHzHElyGPXatax9yXaBAHDt2sTX09qa/EzKgLKPqodGy99EW5vyuKeH+ylRBjI9WMnNzYXX6w3dpk2bBkCpVXnppZewadMmPProo5gzZw5ee+013LhxA+3M9GotWWOivbLVDu+Uw6iJWlsjM80WFCQXOE8k6Zs0MBAdQGtrPOxaI1hfr8yBxFFFRBFMD1ZOnToFn8+HWbNmYdWqVThz5gwA4OzZs+jv78eSJUtCy3o8HtTW1uLo0aNmF4uA2NlFZY2J1vXrkY9l1TeRDFzr6pT7lhbleTOHG0+EXRMrqudEIqIQU4OVhQsX4vXXX8d///d/42c/+xn6+/uxePFiXLlyBf39/QCAioqKiPdUVFSEXtMzMjKCoaGhiBvFoQ5I5IRwsvOjOruoywWUlSn3n34avR63W7na8/uVZdQ1Ksk0QbHjrrPIwLW9PbLvxvTpVpYqTB2Ql5Zal1hxPBPt7EvkUGnNs3L9+nXccccd+OEPf4i7774b3/jGN3DhwgVUVlaGllmzZg36+vrwq1/9SncdW7ZswdatW6OeZ56VGOIFA1VVxkdYAOEhnlKyu04gENnExKHQzmVlMFpaCoyMRPe1ApQJCzlRIZGlbJtnpbi4GHfeeSdOnToVGhWkrUW5dOlSVG2L2saNGzE4OBi69fX1mVrmjFdTE/u1eIFKjs6uMTYW+by6XV3d1q73t9+v3C9YADQ16X9mKkcXcaSSPaj3P9lElC5Xr+oHKkC4VpGIMoNIoy+++ELcfvvtYuvWrSIYDAqv1yueffbZ0OsjIyOipKREBAIBw+scHBwUAMTg4KAZRXaOoiJ1Y83Ebm53+F5SP6f3d6xbVZUQbW1C+P1ClJUpz/n9E/++fr/xdcnPb2ub+OdSNPX+0NYWuU+Ulob/7+m81dSEy8f/P5ElEjl/mxqsNDc3i8OHD4szZ86I48ePi4cfflhMnjxZnDt3TgghRGtrqygpKRH79+8X3d3doq6uTlRWVoqhoSHDn8FgxaCJHtxzcsLBRV2dcsKpqwuvX/2c9m/tSUIGEtrgx+VSTlylpdGfrf2M8Rg9Ael9zkTXSZG0/7e2NuV/DQiRl6e8JveLtrbwdjYjSFGrqVGey89PXZBMRIbZJlhZuXKlqKysFHl5ecLn84lHH31UnDx5MvR6MBgUmzdvFl6vV3g8HnHvvfeK7u7uhD6DwYpByRzY5QlFXYOSDL2amLw8/c+MdZKSwZIsVzzxgpq2NmMnM23AFO+kR4nRBrBy31D/38bb/kZvVVWRj9X7hHa/YxBKlFa2CVbSgcGKQYke5Nvaok/68kpUXYVuhF7wIK+eYw
Ut8W75+fFrOdTNDPJ1I+stLQ2vM5ErdEqMtmlQHajIx6kIVMb73yW7P8eSSM0fESV0/uasy9ki0VEZfr8yhPnGDSU
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df_gowalla = pd.read_csv(os.path.join('test_data', 'gowalla', 'gowalla_checkins_full.txt'),\n",
" sep='\\t', \n",
" header=None,\n",
" names=['user id', 'check-in time', 'latitude', 'longitude', 'location id'],\n",
" parse_dates=['check-in time'],\n",
" engine='pyarrow')\n",
"\n",
"# take only the dates from 2009\n",
"df_gowalla = df_gowalla[df_gowalla['check-in time'].dt.year == 2009]\n",
"\n",
"# convert the dataframe to geopandas dataframe\n",
"gdf_gowalla = gpd.GeoDataFrame(df_gowalla, geometry=gpd.points_from_xy(df_gowalla.longitude, df_gowalla.latitude))\n",
"\n",
"# plot the geopandas dataframe\n",
"gdf_gowalla.plot(marker='o', color='red', markersize=1)\n",
"print(\"Number of unique users: \", len(df_gowalla['user id'].unique()))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This is still a bit too much, to help us in the next sections, let's take a subset of the European area"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of unique users in the UE area: 3718\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAFjCAYAAACdVWn2AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABUTUlEQVR4nO3dfXBc1Xk/8Gd39WKvkBWtDZZUrLWT4KSJIQFWDXYaoBAMlFIKJbJXace0DVMZHHBjpoi4GTu/ibCSGdPAOGibl8kkk/htQmDSCSRxJ7FJYkhkYyaqkybuYEsq2DGxjSRLWLal8/vj5Oiee/bct927e+/e/X5mdlbal7tnX+9zn3POc2KMMUYAAAAAZRIPugEAAABQXRB8AAAAQFkh+AAAAICyQvABAAAAZYXgAwAAAMoKwQcAAACUFYIPAAAAKKuaoBugmpmZoTfeeIMaGxspFosF3RwAAABwgTFG4+Pj1NbWRvG4fW4jdMHHG2+8QYsWLQq6GQAAAFCAkZERuvzyy21vE7rgo7GxkYh44+fNmxdwawAAAMCNsbExWrRo0ex+3E7ogg/R1TJv3jwEHwAAABXGzZAJDDgFAACAskLwAQAAAGWF4AMAAADKCsEHAAAAlBWCDwAAACgrBB8AAABQVgg+AAAAoKwQfAAAAEBZeQ4+Xn/9dfq7v/s7mj9/PiWTSfrgBz9IBw8enL2eMUabN2+mtrY2mjt3Lt144410+PBhXxsNAAAAlctT8HHmzBn68Ic/TLW1tfTCCy/Qr3/9a9q6dSu94x3vmL3NF77wBXriiSdo27ZtNDAwQC0tLXTLLbfQ+Pi4320HAACAChRjjDG3N+7p6aGf//zn9NOf/lR7PWOM2traaP369fToo48SEdHU1BQtXLiQPv/5z9M///M/Oz7G2NgYNTU10ejoKMqrAwAAVAgv+29PmY/vfe97lMlk6GMf+xhddtlldPXVV9NXvvKV2euPHj1KJ06coJUrV85eVl9fTzfccAPt379fu82pqSkaGxsznQCgSuRyRIsXE8Vi5lMqFXTLCtPVRVRTw8+tiOecTvPn2tGRf736eohTXZ1+m4kEvz6RsG+feOxcjp8aGvh97NoLUArMg/r6elZfX88ee+wx9sorr7BcLsfmzJnDvvGNbzDGGPv5z3/OiIi9/vrrpvvdf//9bOXKldptbtq0iRFR3ml0dNRL0wCgEqVSjBHpT4Xq72csnebnVtrb+WO0t3u7n5NEgm83kbBul+45y9Jp69dE97pYXZ/N8nZks/mPnU7nP04xzxuAMTY6Oup6/+3pG15bW8uWL19uuuyTn/wku+666xhjRvDxxhtvmG7ziU98gt16663abZ47d46Njo7OnkZGRhB8AESN1Y69FMGH2Kmm09a3UR+nv98IHOzupyM/N3mHb9WuVIr/LQKgTCZ/e1avSW2t/XORX7d4nP8fjzOWTPK/6+oYi8Wst69rN4BLXoIPT90ura2t9L73vc902Z/+6Z/S8PAwERG1tLQQEdGJEydMtzl58iQtXLhQu836+nqaN2+e6QQAEdPXRzQ0xM9lvb28u8BPPT28S6Onh3dT6Lor2tvN5319RNPTvAuip8fb48nPbft2oosX+blVu3p7iY4d4/dhjGhgwLhNQwPR2rXm+2Wz/Hbt7UQXLvBtCLlc/uOILpqZGf5/IkE0Ocn/Pn+eb0uWyRh/79rl+mkDFMNT8PHhD3+Yfvvb35ou+93vfkfpP34ZlixZQi0tLbRnz57Z68+fP0/79u2jFStW+NBcAChaQwPfOTU0lO8x5YBA1t3Nd5LyDpXIvEP0qrub79y7u/nOmsg4F8SOf2jI3L5t2/j9vBD3HRvjr6v6XIh4kNDXx2+rbl8e0yGCBNmOHXxMxh8P8mh4mG+voyM/UFElk/nPXXXggPH3nDn2twXwi5eUyi9/+UtWU1PDent72ZEjR9i3v/1tlkwm2be+9a3Z2/T19bGmpib23e
9+lw0ODrJsNstaW1vZ2NiY72kbACiAH10bfpO7GnTjJQpVW5vfXSF3k4juCNEdUsy4B/l1VbtSdF1BduM65FMsZnQJiZPTuBCvp3i8+OcPVa9kYz4YY+w///M/2bJly1h9fT1773vfy7785S+brp+ZmWGbNm1iLS0trL6+nl1//fVscHCwJI0HgAKEMfhgzH68hJ/ksRd+jjcRYzh025ADHq+BQTbLgxl1cKh6WSGn9vbiB9lWuubm/NcECuJl/+2pzkc5oM4HQIl1dRHt3k3U2akfmxB1ogtkfJzo9Gn9bfr7je6Rjg7eNZHJGOMzGhqMLpJs1ngdxW1jMb4rSyaJJibM29aNcUkk+JgT3eXbthFt3Gi0VX48cRsxvsOrRIKPUalmuvcjXLvFilGyOh8AEAF2gyLDQK5FUQpiTEhvLx+f0d+ffxt5YKwYEyGPjZDHZuzebfz9T//Ez8XOSzeGQyaOt7dt048VmZ7OH6S7Y4cxRqShobgd5fS0sa1q1dxs/l8MQoaSQvABAOVnF2BYzYzx+3HlgakqeWCsGPxaW2u0N5k0ru/sNLa7caP+MWVykl8QbYkrP8likG5vr/kxhclJHKUX6/Rp83siBiF7UeqAOYpK3gnkEcZ8AFQBu1ocak0QP4p/uXncYu5nN45EvY+uwJlMHhcianKIAazquBI/T1C4Qj9XEVOyOh8AEJBUqrLLjquspt4S5WckvGRC5GmrXh/XSi7Hx4ekUub7yaXUe3r49efOme+reyx5yqyuzfLzFFkN0eUj7lOMdFoffkDhCvlcVbsyBEOeIPMBoLCbRVENnDIf8iyZUrxOVke1cil19bHt2qBmPtTb22U+xAwXMYXYzSkW4/dLpTCdtlh+ZuEiCLNdAKJEPYoP11c2eDU1RnVSecaIH69TLmeM4+jt5eeiWNiLLxqzhnbvNh47leK3dVusTH5/dW2Wr4/Hib70JX1xMd2MGXxW/LV4Mc/CpdM8OwcmmO0CECUYfW+vs5PveDs7/e1G6OriO/nTp/lp7Vqihx7Sl1IXbchmiU6d8lYlVW6z06q4MzNEGzaYB8ES8c+ImDEjBqaqszigeOhe8Q0yHwCVQFdrAkonl7MuXS52Pl7LsOt0dfGps4LIXsj1N9TMVzyurwkCEDBkPgCi5oor+A7piiuCbknlE4vNyad02pxxkAd9ZjJGhiGTMdLtbqdWWk3DTKfNgQcRUX29kcURslnzbVatcn5MgJBD8FHpnNK0EA1iTIFc0Kpa5XJE8+fzk27n71RzQbfQ2vCw+fWVE8JXXGGsBiuyTl5m4Kxdy2+7dq0R7CxerJ+5MjmZX3l2+3Zz10xYi8MBeIBul0onD7ar9jLJUVbtJdFlYtCfDmPmbgq1FDkRz3zYrfSqbkNcJrNbpVZQu1S8kMu7Q+WSy/ATRX4AMLpdqok82A6iK+wl0cshneZBgZcKlGqmqKPDeYl5K3I3zYsvWldHjcf5bewCD1HJVK65IXevlKK6K5SfU3n9Kobgo9JhpwR+6ugwxkCErVy03E2hliG3ogbl8vosgjqbSFegrKPD/L8IanRdPG6Obt/xDj4ld3zcuO/27TzjkU4TrVgRvtcfvNOVxAciQvABAEIqZa6kWar1VQqhjmm65hrz/8kk33EvXmy+XA3KxRRV2fBwfsCgLvKmBi0iqNGN/RDBi1jZVs5sxOO8rb29RI2NfAqvfF9R3fWHP+TbfeABBCCVbGICVWQtIPgAAO7MGfP/YapnIHefJBJEr7xivv7tt41AQO7KkHV1ER06lL9tXUCyYgV/HFErI5Mxb1OsLPvGG8brJLqFFi3it1OXud+5k1/29ts8yHBTM4Kx8ASA4AwTAFxD8FGpsIoi+E0uSlVbaz2mIQidncZYiunp/B27WI20ttboylCn0+7apa+PIWawyEeo3/kOv60IyA4c4NtQv28XLhivk7xmi5vvprqGjTyL59Zbjem9l17q8kWCwGFWmmsIPipVKZcdh+
okLy1+/nzQrTHbvp3/qD/9tP3tLlwwqpGq5szh5/KYDqvqsVaDUtXtJpPGgYC8Xd13c/Vqnk1ZvTp/u7kc0bp1RjX
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"gdf_gowalla = gdf_gowalla[gdf_gowalla['latitude'] < 60]\n",
"gdf_gowalla = gdf_gowalla[gdf_gowalla['latitude'] > 35]\n",
"gdf_gowalla = gdf_gowalla[gdf_gowalla['longitude'] < 30]\n",
"gdf_gowalla = gdf_gowalla[gdf_gowalla['longitude'] > -10]\n",
"\n",
"gdf_gowalla.plot(marker='o', color='red', markersize=1)\n",
"\n",
"df_gowalla = gdf_gowalla\n",
"print(\"Number of unique users in the EU area: \", len(df_gowalla['user id'].unique()))\n",
"\n",
"# remove from memory the geopandas dataframe, it was only used for plotting\n",
"del gdf_gowalla"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Perfect! Now we can create a new .txt file, only with the information that we need"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"# update the file with the new values. Drop the columns that are not needed\n",
"df_gowalla.to_csv(\n",
" os.path.join('test_data', 'gowalla', 'gowalla_checkins.txt'), \n",
" sep='\\t', \n",
" header=False, \n",
" index=False, \n",
" columns=['user id', 'location id'])\n",
"\n",
"# os.remove(os.path.join('test_data', 'brightkite', 'brightkite_checkins_full.txt'))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Foursquare\n",
"\n",
"[Foursquare](https://foursquare.com/) is a location-based social networking website where users share their locations by checking-in. This dataset includes long-term (about 22 months from Apr. 2012 to Jan. 2014) global-scale check-in data collected from Foursquare, and also two snapshots of user social networks before and after the check-in data collection period (see more details in our paper). We will work with three different datasets:\n",
"\n",
"- `foursquare_checkins.txt`: a tsv file with 4 columns: `User ID`, `Venue ID`, `UTC time`, `Timezone offset in minutes` \n",
"\n",
"- `foursquare_friends_edges.txt`: the friendship network, a tsv file with 2 columns of users ids. This is in the form of a graph edge list. \n",
"\n",
"- `raw_POIs.txt`: the POIS, a tsv file with 5 columns: `Venue ID`, `Latitude`, `Longitude`, `Venue category name`, `Country code (ISO)`.\n",
"\n",
"--- \n",
"\n",
"This dataset is by far the biggest of the 3 that we got. The check-in dataset contains 22,809,624 checkins by 114,324 users on 3,820,891 venues. The social network data contains 607,333 friendships. As explained before, we are going to need sub-samples! In this case, we'll take only data from 2012 that have been generated in Italy. Due to the size of the full network, this time we won't plot it, otherwise our RAM might cry. "
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Starting to plot\n",
"Number of unique users in Italy: 2555\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAakAAAGdCAYAAACox4zgAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABjuElEQVR4nO2dfZBcVZn/n+5OT5IOSZyOQQxkOloiCgirdlZNUYnvuGsJam0l9CygpbU6EeSlLHQ0Wlj+KjC+4AsC3bWltVWulUCqNPi2i7IrL7rZXTKLaGRRfIFJeFGQhJmQd2bO74+zz9znnj7n3nP79Xb391N1q6e77z197p3u873Pc57zPBmllCIAAAAghWS73QEAAADABUQKAABAaoFIAQAASC0QKQAAAKkFIgUAACC1QKQAAACkFogUAACA1AKRAgAAkFoWdLsDJnNzc/TEE0/Q0qVLKZPJdLs7AAAA2oBSig4ePEirVq2ibNZtL6VOpJ544glavXp1t7sBAACgA+zbt49OO+005/upE6mlS5cSke74smXLutwbAAAA7WBmZoZWr149P+a7SJ1IsYtv2bJlECkAAOhz4qZ1EDgBAAAgtUCkAAAApBaIFAAAgNQCkQIAAJBaIFIAAABSC0QKAABAaoFIAQAASC0QKQAAAKkFIgUAACC1QKQAAACkFogUAACA1AKRAgAAkFogUgC0klqNaM0a/QgAaBqIFADNIoVpYoJoako/AgCaBiIFABHRkiVEmYx+9KVW0/tv3hwI0/g4UamkHwEATQORAoMLW0Cjo0SHD+vX+NHn2MsvD+8/Pk40Nkb06KP6EQDQNBApMDiY80XsmtuxI9inUPBr64oriGZng+eVCoQJgDYAkQKDgzlfND6uXXyzs/qxWiU6dMivrRMnws+3bQs/RwAFAC0BIgXST6sG/PFxonxeC9XQkH5NqfAjkXb/LVigH119GRkJXstm649DAAUALSGjlPx1dp+ZmRlavnw5TU9P07Jly7rdHdAtOFJufJxoyxai/fuJikWiZ55prt1MJvi7VCKamSE6cCB4/uijWmjYlVetht14Q0OBFVUs6n6Zx+VyRDfdFPQfbkAA6vAd62FJgXQSZYk0Y1mVy/oxn9cCsn+/FiIZkbdxY7D/5s1hi0q6+fbv149TU4E1lc3q4xFAAUBLgCUF0om0pIjCVsmaNVoY2IJpB7kc0dyc/jubDSwraUnZ4D7VatoCJCLauhViBYABLCnQ20hLxLRKfNciNWNxsUARacFibrwx/NxEiur+/XrbvFm7GUul5P0AYMCBJQX6k1pNi4PEnF+KQs5dZTJEt9yi/56YIFq3jujHPyY6epRo0SKik04i2rtXB1NkMoFQXXZZWOyYkRFtCQIwwMCSAr1PUkuoWNQiUSza57J8Iu34MyVKadcdz5Pt2kW0dKleyLt0qX5NKf3ZPI82Nka0aVMwVyXZu9fvfAAAECmQYszgiTjR4ii9AwfsrsAnnogWvNHRIMVRqRR26z37rLag2M1oczmar+3apS2pfD78OblcdJg7ACBApYzp6WlFRGp6errbXQHdplpVqlTSj0rpv4n0o43hYf3+8LB+ns/r53JzHauUUrlcsF+5XH9ssRj0i/cdGYnu/9BQ/efL47PZ8DlWq/pzisXgNQD6EN+xHiIFOgMLRj7feBumaPnsbwpNPm8/3iZKti2brX+N2zPbUCr8vFyu3y+TCV8XFuI4QW31tQKgw0CkQLowLZIooWAhaYVVIQd918ZC0ejGfbSJF58TC5R5LUxRc52zFB1TgGyCxOddLNrFyiayAHQQiBRITjvvvqPcbrYBu1RqjVXBg34j4lMoBC7EKMuqUIhuw7ye0pJitx+LWLWqj8lmlapUgmOkq9N0e9rcoPy/5HPn96rVelF2CRkAbQQiBZIjRUHe+bcCHpiz2bCVEOWSa8aSMo+tVPwsIvk5SS2sXE4LjCla7MqTgsduQ57nYuT/IJ
cLn08SS8p2nNm+PO9mbgQAaACIFEiOOZB3AnPQzOUam3tyubvkIOyyhHK5wGoZGQmLSCNbqWS/lrb9zHN0WVKm6JpBInHXjN+vVALxzWTsogdAB4BIgeS005JyUa2GBaFSiXZf2QZS3p/FplgMouoKBbdImVaMzaqTbjt+zOX09XFZWmwJmvNRcv+kASSm69P2mVHWUFxkJAAdxnesxzopEMDrfKpVotNP1+t41q5tb12ksTGdF4+H223b7GuQbAlnR0eDBbREup3bbtOpiI4f16+tXKlz59ng12s1ohUriD7yEXc/V64MPz/9dJ2FolQKl+0g0rn9JiaIdu/W57R7t379llv0QuNiUadXSsL4eHDs+DjR8HD9Z2azei2X63iUtQe9SIdE0xtYUinAFo4ddwdeqYTdZq2GLSl2x0XNCfFz6c6y7cv9jnPfDQ/XXxM5XySvQSZjD5Zo1zUx14LBUgI9Aiwp0DiTk8Hf5bLfHfiOHdqSkaXYG6FW09YRb5yVgZPMulIKFQq6htMNN+i/MxmixYv1ex/7WP3+N92kH7dvj+/TsmXha5LJhMt5cCaM9et1holDh8I5AkdHtZWTyRAtWRJYpc0Wc9yypT4ju2nxAdDrdEg0vRlYS0paIt2eyLat7YmjEUvKdp62LBHSanEtuk0SNOGzZskMbnD1R2aPcFkxMpOF3C/JPJHtWtnOLZOJbwuAFIDAiV5CDoK5XP9NcvMAywESZnaFuDVTpvBFpSzizzPTEUkxkwO+DI2XAmkKtNkfufCY/2+umwrpgpSuQO5HuVwv8KaoyuCQqPB9Ds6IusHo9k0QAAoi1VvIO/9uWlIyzJmj7Mw+8LxLJqPnh3I5+zwRR9W55pDMz7OlFrKlMHKFhheLYcGx7cODPYuGGd0XRVTaJA7tlo+uuTDb/9bM4xdn3fluLuKyUQDQASBSvYS5UNNchMpC0K7AhLgBOCo0u9kBNKnV6BLDuMwQtkW2SUTK9tnSYpGWjs/5y/Pl/3FU5opGNvMGgNdgyeuW5NoD0EI6IlLXXXedIiJ15ZVXhl7/3//9X/Wud71LLVu2TJ100knqda97nZqamvJqcyBFSiLnGYrF+kHPFlXWLFEDHQuofK0RS0paN7wAVankc1kusYsbsOUiVim+LmxWnhRCs79ysayPNcQ3IJxWSbohW31DYBO/RhZNA9BC2i5S9913n1qzZo0655xzQiL1+9//XhWLRXXNNdeo+++/X/3hD39QP/zhD9Wf//znlna8bzFFKs6SSpp5wIbLJSfzydmEIQmufiW1pMxyGoy8Dj6CJWFLcmjInj7IN7ChWLRbbLbNPBfzekvB8zmnJFsm076lAgB40laROnjwoDr99NPVnXfeqTZs2BASqU2bNqmLL764kWaVUhCpurvrOMyBzzXo24IFyuX6ATWTcc9F2frlEh9TPKPON4klZe5vS6QatZbKZkXZLAybJcWfLQMs+PpF1a6Sa7Gy2eBzXW5WPjf+X9pEb3jY37UYd/4AdIG2itSll16qrrrqKqWUConU7OysOumkk9TnPvc59fa3v12tXLlS/fVf/7XauXOns62jR4+q6enp+W3fvn2DLVI+loUUBtvgZRuEkiZLtX2+OeEuxcDcX7YVZd3JPHJR2NqQ4d88mCdxmZmibb7Oj75zRfJcpLjZrCV5PW2RiIVCEPUn/3fmPBqLs801nPT/C0AHaZtIbd++XZ199tnqyJEjSqmwSD355JOKiFShUFBf/vKX1S9+8Qt1/fXXq0wmo+6++25re9dee60iorqt50TKx81m1kvybce0Ynhw47BpM5pNDqp8V+47eOXz8dm2XbWT5D7cBxZN2+BvWh9R2MTblpyW+yAH9pERt9Xiyg/IbdssJNtWLgfWF/8/WJBs10qef5JktpzRQv7NbuEkNyKFQvgmo1CIvv4AtJi2iNTevXvVySefrB544IH516RIPf7444qIVMVw3bzrXe9SF110kbXNvrCk5E
Bm3kVLGr2TjRtw4rJ8+7RhioUc6M0BjN1e7NKTczkyeozbkWHtrrv9uEHSFEyb6HD4vvn/cJ2nubbJjLK0WWSuKEz
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import pandas as pd\n",
"import geopandas as gpd\n",
"\n",
"df_foursquare_POIS = pd.read_csv(os.path.join('test_data', 'foursquare', 'raw_POIs.txt'), \n",
" sep='\\t',\n",
" header=None,\n",
" names=['venue id', 'latitude', 'longitude', 'venue category name', 'ISO code'],\n",
" dtype={'venue id': str, 'latitude': float, 'longitude': float, 'venue category name': str, 'ISO code': str},\n",
" engine='c')\n",
"\n",
"df_foursquare_checkins = pd.read_csv(os.path.join('test_data', 'foursquare', 'foursquare_checkins_full.txt'),\n",
" sep='\\t',\n",
" header=None,\n",
" names=['user id', 'venue id', 'UTC time', 'offset'],\n",
" dtype={'user id': str, 'venue id': str, 'UTC time': str, 'offset': int},\n",
" engine='c')\n",
"\n",
"# Take only the data with IT ISO code\n",
"df_foursquare_POIS = df_foursquare_POIS[df_foursquare_POIS['ISO code'] == 'IT']\n",
"\n",
"# Take only the checkins that are in the POIs (filtered by ISO code) and viceversa\n",
"df_foursquare_checkins = df_foursquare_checkins[df_foursquare_checkins['venue id'].isin(df_foursquare_POIS['venue id'])]\n",
"df_foursquare_POIS = df_foursquare_POIS[df_foursquare_POIS['venue id'].isin(df_foursquare_checkins['venue id'])]\n",
"\n",
"# Convert to datetime\n",
"df_foursquare_checkins['UTC time'] = pd.to_datetime(df_foursquare_checkins['UTC time'])\n",
"\n",
"# Take only the data from 2012\n",
"df_foursquare_checkins = df_foursquare_checkins[df_foursquare_checkins['UTC time'].dt.year == 2012]\n",
"\n",
"# convert the dataframe to geopandas dataframe\n",
"gdf_foursquare_POIS = gpd.GeoDataFrame(df_foursquare_POIS, geometry=gpd.points_from_xy(df_foursquare_POIS.longitude, df_foursquare_POIS.latitude))\n",
"\n",
"# plot the geopandas dataframe\n",
"print(\"Starting to plot\")\n",
"gdf_foursquare_POIS.plot(marker='o', color='red', markersize=1)\n",
"print('Number of unique users in Italy: ', len(df_foursquare_checkins['user id'].unique()))\n",
"\n",
"# delete from memory the geo dataframe, it was only used for plotting\n",
"del gdf_foursquare_POIS"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"df_foursquare_checkins.to_csv(\n",
" os.path.join('test_data', 'foursquare', 'foursquare_checkins.txt'),\n",
" sep='\\t',\n",
" header=False,\n",
" index=False,\n",
" columns=['user id', 'venue id'])\n",
"\n",
"# os.remove(os.path.join('test_data', 'foursquare', 'foursquare_checkins_full.txt'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Building the networks"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"We are going to build the the networks for the three datasets as an undirected graph $M = (V, E)$, where $V$ is the set of nodes and $E$ is the set of edges. The nodes represent the users and the edges indicates that two individuals visited the same location at least once.\n",
"\n",
"The check-ins files of the three datasets are not in the form of a graph edge list, so we need to manipulate them. Let's have a look at the number of lines of each file."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"gowalla\n",
"Number of lines: 83269\n",
"Number of unique elements: 3718\n",
"\n",
"brightkite\n",
"Number of lines: 391220\n",
"Number of unique elements: 8525\n",
"\n",
"foursquare\n",
"Number of lines: 125076\n",
"Number of unique elements: 2555\n",
"\n"
]
}
],
"source": [
"def count_lines_and_unique_elements(file):\n",
" df = pd.read_csv(file, sep='\\t', header=None)\n",
" print('Number of lines: ', len(df))\n",
" print('Number of unique elements: ', len(df[0].unique()))\n",
"\n",
"gowalla_path = os.path.join('test_data', 'gowalla', 'gowalla_checkins.txt')\n",
"brightkite_path = os.path.join('test_data', 'brightkite', 'brightkite_checkins.txt')\n",
"foursquare_path = os.path.join('test_data', 'foursquare', 'foursquare_checkins.txt')\n",
"\n",
"_ = [gowalla_path, brightkite_path, foursquare_path]\n",
"\n",
"for path in _:\n",
" print(path.split(os.sep)[-2])\n",
" count_lines_and_unique_elements(path)\n",
" print()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"We would like to build a graph starting from an edge list. To do that, we are going to check, for each venue, all the users that visited it. Then, we will create an edge between each pair of users that visited the same venue (avoiding repetitions). This can be easily done in python, but it's going to be a bit slow (this is why we are considering sub-samples of the datasets). Let's see how to do it.\n",
"\n",
"```python\n",
"# let df be the dataframe [\"user_id\", \"venue_id\"] of the checkins\n",
"\n",
"venues_users = df.groupby(\"venue_id\")[\"user_id\"].apply(set)\n",
"\n",
" for users in venues_users:\n",
" for user1, user2 in combinations(users, 2):\n",
" G.add_edge(user1, user2)\n",
"```\n",
"\n",
"It the `utilis.py` module, we have a function that does exactly this called `create_graph_from_checkins`. It takes as input the name of the dataset and returns a networkx graph object. By default it will also write the edge list to a file in the respective dataset folder. The options are\n",
"\n",
"- `brightkite`\n",
"- `gowalla`\n",
"- `foursquare`\n",
"\n",
"Let's see how it works:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Creating the graph for the dataset brightkite...\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 84831/84831 [00:00<00:00, 273307.92it/s]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done! The graph has 292973 edges and 6493 nodes\n",
"\n",
"Creating the graph for the dataset gowalla...\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 31095/31095 [00:00<00:00, 292747.37it/s]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done! The graph has 62790 edges and 3073 nodes\n",
"\n",
"Creating the graph for the dataset foursquare...\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 40650/40650 [00:00<00:00, 102938.86it/s]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done! The graph has 246702 edges and 2324 nodes\n"
]
}
],
"source": [
"# It takes about 3 minutes to create the all the 4 graphs on a i7-8750H CPU\n",
"\n",
"G_brighkite_checkins = create_graph_from_checkins('brightkite')\n",
"G_brighkite_checkins.name = 'Brightkite Checkins Graph'\n",
"\n",
"G_gowalla_checkins = create_graph_from_checkins('gowalla')\n",
"G_gowalla_checkins.name = 'Gowalla Checkins Graph'\n",
"\n",
"G_foursquare_checkins = create_graph_from_checkins('foursquare')\n",
"G_foursquare_checkins.name = 'Foursquare Checkins Graph'"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Friendship network\n",
"\n",
"Now we want to create a graph where two users are connected if they are friends in the social network. We are intending the concept of friendship in a \"facebook way\", not a \"twitter way\". Less empirically, the graphs is not going to be directed and the edges are not going to be weighted. A user can't be friend with himself, and can't be friend with a user without the user being friend with him.\n",
"\n",
"Since we filtered the checkins for foursquare and gowalla, we are considering only the users that are also present in the check-ins graph. We can build this graph with the function `create_friendships_graph` in the `utils.py` module. It takes as input the name of the dataset and returns a networkx graph object. By default it will also write the edge list to a file in the respective dataset folder. The options are\n",
"\n",
"- `brightkite`\n",
"- `gowalla`\n",
"- `foursquare`\n",
"\n",
 **NOTE:** This functions">
"> **NOTE:** This function does not need the check-ins graphs to be loaded in memory; it works from the edge list files. This choice was made because someone may want to analyze only the friendship network, in which case loading the check-ins graph would waste memory. Furthermore, networkx is very slow at loading a graph from an edge list file, so this choice is also motivated by the speed of the function.\n",
"\n",
"Let's see how it works:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Computation done for Brightkite friendship graph\n",
"Computation done for Gowalla friendship graph\n",
"Computation done for Foursquare friendship graph\n"
]
}
],
"source": [
"G_brighkite_friends = create_friendships_graph('brightkite')\n",
"print(\"Computation done for Brightkite friendship graph\")\n",
"G_brighkite_friends.name = 'Brightkite Friendship Graph'\n",
"\n",
"\n",
"G_gowalla_friends = create_friendships_graph('gowalla')\n",
"print(\"Computation done for Gowalla friendship graph\")\n",
"G_gowalla_friends.name = 'Gowalla Friendship Graph'\n",
"\n",
"\n",
"G_foursquare_friends = create_friendships_graph('foursquare')\n",
"print(\"Computation done for Foursquare friendship graph\")\n",
"G_foursquare_friends.name = 'Foursquare Friendship Graph'"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have our graphs, let's have a look at some basic information about them"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Brightkite Friendship Graph\n",
"Number of nodes: 5420\n",
"Number of edges: 14690\n",
"\n",
"(Filtered) Gowalla Friendship Graph\n",
"Number of nodes: 2294\n",
"Number of edges: 5548\n",
"\n",
"Foursquare Friendship Graph\n",
"Number of nodes: 1397\n",
"Number of edges: 5323\n",
"\n"
]
}
],
"source": [
"for G in [G_brighkite_friends, G_gowalla_friends, G_foursquare_friends]:\n",
" print(G.name)\n",
" print('Number of nodes: ', G.number_of_nodes())\n",
" print('Number of edges: ', G.number_of_edges())\n",
" print()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Proprieties of the structure of the networks\n",
"<!-- \n",
"Given a social network, which of its nodes are more central? This question has been asked many times in sociology, psychology and computer science, and a whole plethora of centrality measures (a.k.a. centrality indices, or rankings) were proposed to account for the importance of the nodes of a network. \n",
"\n",
"These networks, typically generated directly or indirectly by human activity and interaction (and therefore hereafter dubbed \"social\"), appear in a large variety of contexts and often exhibit a surprisingly similar structure. One of the most important notions that researchers have been trying to capture in such networks is “node centrality”: ideally, every node (often representing an individual) has some degree of influence or importance within the social domain under consideration, and one expects such importance to surface in the structure of the social network; centrality is a quantitative measure that aims at revealing the importance of a node. \n",
"\n",
"Among the types of centrality that have been considered in the literature, many have to do with distances between nodes. Take, for instance, a node in an undirected connected network: if the sum of distances to all other nodes is large, the node under consideration is peripheral; this is the starting point to define Bavelas's closeness centrality, which is the reciprocal of peripherality (i.e., the reciprocal of the sum of distances to all other nodes). \n",
"\n",
"The role played by shortest paths is justified by one of the most well-known features of complex networks, the so-called **small-world phenomenon**. A small-world network is a graph where the average distance between nodes is logarithmic in the size of the network, whereas the clustering coefficient is larger (that is, neighborhoods tend to be denser) than in a random Erdős-Rényi graph with the same size and average distance. The fact that social networks (whether electronically mediated or not) exhibit the small-world property is known at least since Milgram's famous experiment and is arguably the most popular of all features of complex networks. For instance, the average distance of the Facebook graph was recently established to be just $4.74$ \n",
"\n",
"--- \n",
"\n",
"In 1998 Watts and Strogatz proposed a simple model for generating networks with the small-world property. The model is based on a regular lattice of $N$ nodes, where each node is connected to its $k$ nearest neighbors. The model then proceeds as follows: for each edge, the probability $p$ of rewiring it is considered. If the edge is rewired, it is replaced by a random edge with uniform probability. The resulting network is a small-world network with $N$ nodes, $k$ nearest neighbors, and average distance $\\log(N)/\\log(k)$. -->"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Introduzione da scrivere\n",
"\n",
"To help us visualize the results of our analysis we can create a dataframe and fill it with all the information that we will retrive from our networks in this section.\n",
"\n",
"As we'll see in the cells below, the full networks are very big, even after the filtering that we did. This leads to long run times for the functions that we are going to use. To avoid this, we are going to use a sub-sample of the networks. Depending on how much we want to sample, our results will be more or less accurate. \n",
"\n",
"What I suggest to do while reviewing this network is to use higher values for the sampling rate, so that you can see the results faster. This will give you a general idea of how the implemented functions work. Then, at the end of this section I have provided a link from my GitHub repository where you can download the results obtained with very low sampling rates. In this way you can test the functions with mock-networks and see if they work as expected, then we can proceed with the analysis using the more accurate results that required more time to compute."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"checkins_graphs = [G_brighkite_checkins, G_gowalla_checkins, G_foursquare_checkins]\n",
"friendships_graph = [G_brighkite_friends, G_gowalla_friends, G_foursquare_friends]\n",
"\n",
"graphs_all = checkins_graphs + friendships_graph"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Graph</th>\n",
" <th>Number of Nodes</th>\n",
" <th>Number of Edges</th>\n",
" <th>Average Degree</th>\n",
" <th>Average Clustering Coefficient</th>\n",
" <th>log N</th>\n",
" <th>Average Shortest Path Length</th>\n",
" <th>betweenness centrality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Brightkite Checkins Graph</td>\n",
" <td>6493</td>\n",
" <td>292973</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>8.778480</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Gowalla Checkins Graph</td>\n",
" <td>3073</td>\n",
" <td>62790</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>8.030410</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Foursquare Checkins Graph</td>\n",
" <td>2324</td>\n",
" <td>246702</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>7.751045</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Brightkite Friendship Graph</td>\n",
" <td>5420</td>\n",
" <td>14690</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>8.597851</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>(Filtered) Gowalla Friendship Graph</td>\n",
" <td>2294</td>\n",
" <td>5548</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>7.738052</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Foursquare Friendship Graph</td>\n",
" <td>1397</td>\n",
" <td>5323</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>7.242082</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Graph Number of Nodes Number of Edges \\\n",
"0 Brightkite Checkins Graph 6493 292973 \n",
"1 Gowalla Checkins Graph 3073 62790 \n",
"2 Foursquare Checkins Graph 2324 246702 \n",
"3 Brightkite Friendship Graph 5420 14690 \n",
"4 (Filtered) Gowalla Friendship Graph 2294 5548 \n",
"5 Foursquare Friendship Graph 1397 5323 \n",
"\n",
" Average Degree Average Clustering Coefficient log N \\\n",
"0 NaN NaN 8.778480 \n",
"1 NaN NaN 8.030410 \n",
"2 NaN NaN 7.751045 \n",
"3 NaN NaN 8.597851 \n",
"4 NaN NaN 7.738052 \n",
"5 NaN NaN 7.242082 \n",
"\n",
" Average Shortest Path Length betweenness centrality \n",
"0 NaN NaN \n",
"1 NaN NaN \n",
"2 NaN NaN \n",
"3 NaN NaN \n",
"4 NaN NaN \n",
"5 NaN NaN "
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"analysis_results = pd.DataFrame(columns=['Graph', 'Number of Nodes', 'Number of Edges', 'Average Degree', 'Average Clustering Coefficient', 'log N', 'Average Shortest Path Length', 'betweenness centrality', 'omega-coefficient'], index=None)\n",
"\n",
"for graph in graphs_all:\n",
" analysis_results = analysis_results.append(\n",
" {'Graph': graph.name, \n",
" 'Number of Nodes': graph.number_of_nodes(), \n",
" 'log N': np.log(graph.number_of_nodes()),\n",
" 'Number of Edges': graph.number_of_edges()}, \n",
" ignore_index=True)\n",
"\n",
"analysis_results"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Average Degree\n",
"\n",
"The degree of a node is the number of links connected to it. The average degree alone, is not very useful for our future analysis, so we won't spend much time about it. In the next section we will see that the degree distribution is a much more useful measure.\n",
"\n",
"The degree distribution, $P(k)$, is the fraction of sites having degree $k$. We know from the literature that many real networks do not exhibit a Poisson degree distribution, as predicted in the ER model. In fact, many of them exhibit a distribution with a long, power-law, tail, $P(k) \\sim k^{-\\gamma}$ with some $γ$, usually between $2$ and $3$.\n",
"\n",
"For know, we will just compute the average degree of our networks and add it to the dataframe."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Graph</th>\n",
" <th>Number of Nodes</th>\n",
" <th>Number of Edges</th>\n",
" <th>Average Degree</th>\n",
" <th>Average Clustering Coefficient</th>\n",
" <th>log N</th>\n",
" <th>Average Shortest Path Length</th>\n",
" <th>betweenness centrality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Brightkite Checkins Graph</td>\n",
" <td>6493</td>\n",
" <td>292973</td>\n",
" <td>90.242723</td>\n",
" <td>NaN</td>\n",
" <td>8.778480</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Gowalla Checkins Graph</td>\n",
" <td>3073</td>\n",
" <td>62790</td>\n",
" <td>40.865604</td>\n",
" <td>NaN</td>\n",
" <td>8.030410</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Foursquare Checkins Graph</td>\n",
" <td>2324</td>\n",
" <td>246702</td>\n",
" <td>212.30809</td>\n",
" <td>NaN</td>\n",
" <td>7.751045</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Brightkite Friendship Graph</td>\n",
" <td>5420</td>\n",
" <td>14690</td>\n",
" <td>5.420664</td>\n",
" <td>NaN</td>\n",
" <td>8.597851</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>(Filtered) Gowalla Friendship Graph</td>\n",
" <td>2294</td>\n",
" <td>5548</td>\n",
" <td>4.836966</td>\n",
" <td>NaN</td>\n",
" <td>7.738052</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Foursquare Friendship Graph</td>\n",
" <td>1397</td>\n",
" <td>5323</td>\n",
" <td>7.620616</td>\n",
" <td>NaN</td>\n",
" <td>7.242082</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Graph Number of Nodes Number of Edges \\\n",
"0 Brightkite Checkins Graph 6493 292973 \n",
"1 Gowalla Checkins Graph 3073 62790 \n",
"2 Foursquare Checkins Graph 2324 246702 \n",
"3 Brightkite Friendship Graph 5420 14690 \n",
"4 (Filtered) Gowalla Friendship Graph 2294 5548 \n",
"5 Foursquare Friendship Graph 1397 5323 \n",
"\n",
" Average Degree Average Clustering Coefficient log N \\\n",
"0 90.242723 NaN 8.778480 \n",
"1 40.865604 NaN 8.030410 \n",
"2 212.30809 NaN 7.751045 \n",
"3 5.420664 NaN 8.597851 \n",
"4 4.836966 NaN 7.738052 \n",
"5 7.620616 NaN 7.242082 \n",
"\n",
" Average Shortest Path Length betweenness centrality \n",
"0 NaN NaN \n",
"1 NaN NaN \n",
"2 NaN NaN \n",
"3 NaN NaN \n",
"4 NaN NaN \n",
"5 NaN NaN "
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"for G in graphs_all:\n",
" avg_deg = np.mean([d for n, d in G.degree()])\n",
" analysis_results.loc[analysis_results['Graph'] == G.name, 'Average Degree'] = avg_deg\n",
"\n",
"analysis_results"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clustering coefficient\n",
"\n",
"The clustering coefficient is usually related to a community represented by local structures. The usual definition of clustering is related to the number of triangles in the network. The clustering is high if two nodes sharing a neighbor have a high probability of being connected to each other. There are two common definitions of clustering. The first is global,\n",
"\n",
"\\begin{equation}\n",
" C = \\frac{3 \\times \\text{the number of triangles in the network}}{\\text{the number of connected triples of vertices}}\n",
"\\end{equation}\n",
"\n",
"where a “connected triple” means a single vertex with edges running to an unordered\n",
"pair of other vertices. \n",
"\n",
"A second definition of clustering is based on the average of the clustering for single nodes. The clustering for a single node is the fraction of pairs of its linked neighbors out of the total number of pairs of its neighbors:\n",
"\n",
"\\begin{equation}\n",
" C_i = \\frac{\\text{the number of triangles connected to vertex }i}{\\text{the number of triples centered on vertex } i}\n",
"\\end{equation}\n",
"\n",
"For vertices with degree $0$ or $1$, for which both numerator and denominator are zero, we use $C_i = 0$. Then the clustering coefficient for the whole network is the average\n",
"\n",
"\\begin{equation}\n",
" C = \\frac{1}{n} \\sum_{i} C_i\n",
"\\end{equation}\n",
"\n",
"In both cases the clustering is in the range $0 \\leq C \\leq 1$. \n",
"\n",
"In random graph models such as the ER model and the configuration model, the clustering coefficient is low and decreases to $0$ as the system size increases. This is also the situation in many growing network models. However, in many real-world networks the clustering coefficient is rather high and remains constant for large network sizes. This observation led to the introduction of the small-world model, which offers a combination of a regular lattice with high clustering and a random graph. \n",
"\n",
"---\n",
"\n",
"As one can imagine by the definition given above, this operation is very expensive. The library `networkx` provides a function to compute the clustering coefficient of a graph. In particular, the function `average_clustering` computes the average clustering coefficient of a graph. \n",
"\n",
"<!-- Unfortunately, since our dataset (even after sub-sampling) are too big to be processed exactly in decent times.\n",
"\n",
"We can use the `average_clustering` function from the `utils` module to compute the average clustering coefficient on a random sub-sample of the graph. The functions takes as input:\n",
"\n",
"- `G: networkx graph object`: the graph on which we want to compute the average clustering coefficient\n",
"- `k: int (default=None)`: percentage of nodes to remove from the graph. If k is None, the average clustering coefficient of each connected component is computed using all the nodes of the connected component.\n",
"\n",
"And returns:\n",
"\n",
"- `float`: the average clustering coefficient of the graph\n",
"\n",
"Depending on the machine and the time available, we can choose different values for `k`. Lower values will give us a more precise result, but will take longer to compute. On the other hand, higher values will give us a less precise result, but will be faster to compute. I suggest to use `k=0.9` to test very quickly the function, and at least `k=0.6` to get a more precise result.\n",
"\n",
"> Since the checkins graphs are way bigger then the friendship graphs, I created two for loop to compute the average clustering coefficient with different values of `k`. -->"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Computing average clustering coefficient for the Brightkite Checkins Graph...\n",
"\tAverage clustering coefficient: 0.7139988006862793\n",
"\tCPU time: 14.8 seconds\n",
"\n",
"Computing average clustering coefficient for the Gowalla Checkins Graph...\n",
"\tAverage clustering coefficient: 0.5483724940778376\n",
"\tCPU time: 1.7 seconds\n",
"\n",
"Computing average clustering coefficient for the Foursquare Checkins Graph...\n",
"\tAverage clustering coefficient: 0.6527297407924693\n",
"\tCPU time: 19.5 seconds\n",
"\n",
"Computing average clustering coefficient for the Brightkite Friendship Graph...\n",
"\tAverage clustering coefficient: 0.21857061612676437\n",
"\tCPU time: 0.1 seconds\n",
"\n",
"Computing average clustering coefficient for the (Filtered) Gowalla Friendship Graph...\n",
"\tAverage clustering coefficient: 0.23429345031911422\n",
"\tCPU time: 0.0 seconds\n",
"\n",
"Computing average clustering coefficient for the Foursquare Friendship Graph...\n",
"\tAverage clustering coefficient: 0.18348521948916247\n",
"\tCPU time: 0.0 seconds\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Graph</th>\n",
" <th>Number of Nodes</th>\n",
" <th>Number of Edges</th>\n",
" <th>Average Degree</th>\n",
" <th>Average Clustering Coefficient</th>\n",
" <th>log N</th>\n",
" <th>Average Shortest Path Length</th>\n",
" <th>betweenness centrality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Brightkite Checkins Graph</td>\n",
" <td>6493</td>\n",
" <td>292973</td>\n",
" <td>90.242723</td>\n",
" <td>0.713999</td>\n",
" <td>8.778480</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Gowalla Checkins Graph</td>\n",
" <td>3073</td>\n",
" <td>62790</td>\n",
" <td>40.865604</td>\n",
" <td>0.548372</td>\n",
" <td>8.030410</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Foursquare Checkins Graph</td>\n",
" <td>2324</td>\n",
" <td>246702</td>\n",
" <td>212.30809</td>\n",
" <td>0.65273</td>\n",
" <td>7.751045</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Brightkite Friendship Graph</td>\n",
" <td>5420</td>\n",
" <td>14690</td>\n",
" <td>5.420664</td>\n",
" <td>0.218571</td>\n",
" <td>8.597851</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>(Filtered) Gowalla Friendship Graph</td>\n",
" <td>2294</td>\n",
" <td>5548</td>\n",
" <td>4.836966</td>\n",
" <td>0.234293</td>\n",
" <td>7.738052</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Foursquare Friendship Graph</td>\n",
" <td>1397</td>\n",
" <td>5323</td>\n",
" <td>7.620616</td>\n",
" <td>0.183485</td>\n",
" <td>7.242082</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Graph Number of Nodes Number of Edges \\\n",
"0 Brightkite Checkins Graph 6493 292973 \n",
"1 Gowalla Checkins Graph 3073 62790 \n",
"2 Foursquare Checkins Graph 2324 246702 \n",
"3 Brightkite Friendship Graph 5420 14690 \n",
"4 (Filtered) Gowalla Friendship Graph 2294 5548 \n",
"5 Foursquare Friendship Graph 1397 5323 \n",
"\n",
" Average Degree Average Clustering Coefficient log N \\\n",
"0 90.242723 0.713999 8.778480 \n",
"1 40.865604 0.548372 8.030410 \n",
"2 212.30809 0.65273 7.751045 \n",
"3 5.420664 0.218571 8.597851 \n",
"4 4.836966 0.234293 7.738052 \n",
"5 7.620616 0.183485 7.242082 \n",
"\n",
" Average Shortest Path Length betweenness centrality \n",
"0 NaN NaN \n",
"1 NaN NaN \n",
"2 NaN NaN \n",
"3 NaN NaN \n",
"4 NaN NaN \n",
"5 NaN NaN "
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"for graph in graphs_all:\n",
" print(\"\\nComputing average clustering coefficient for the {}...\".format(graph.name))\n",
" start = time.time()\n",
" avg_clustering = nx.average_clustering(graph)\n",
" end = time.time()\n",
"\n",
" print(\"\\tAverage clustering coefficient: {}\".format(avg_clustering))\n",
" print(\"\\tCPU time: \" + str(round(end-start,1)) + \" seconds\")\n",
" analysis_results.loc[analysis_results['Graph'] == graph.name, 'Average Clustering Coefficient'] = avg_clustering\n",
"\n",
"analysis_results"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Average Path Length\n",
"\n",
"Since we are considering our networks as _not_ embedded in real space (even if we could theoretically), the geometrical distance between nodes is meaningless. The most important distance measure in such networks is the minimal number of hops (or chemical distance). That is, the distance between two nodes in the network is defined as the number of edges in the shortest path between them. If the edges are assumed to be weighted, the lowest total weight path, called the _optimal path_, may also be used. The usual mathematical definition of the diameter of the network is the length of the path between the farthest nodes in the network.\n",
"\n",
"In the next section, we'll see how to characterize this distance in a small world network. \n",
"\n",
"--- \n",
"\n",
"The `networkx` library provides a function to compute the average shortest path length of a graph. In particular, the function `average_shortest_path_length` computes the average shortest path length of a graph. Unfortunately, as always, there are some limitations. The function can only be applied to connected graphs and since we are taking sub-samples of our datasets, there is a probability that the sub-sample is not connected. Another problem is that this operation is very expensive! The shortest path length is defined as\n",
"\n",
"$$ a = \\sum_{s \\in V} \\sum_{t \\in V} \\frac{d(s,t)}{n(n-1)} $$\n",
"\n",
"Where $V$ is the set of nodes in the graph, $n$ is the number of nodes in the graph, and $d(s,t)$ is the shortest path length between nodes $s$ and $t$. The default (and we are going to use) algorithm to compute the shortest path length is the Dijkstra algorithm. \n",
"\n",
"Since we are interested in the average shortest path length of all our connected components, for each node we need to run the Dijkstra algorithm on all the other nodes. Given the dimensions of our datasets and the slowness of networkx, computing the average shortest path length of the whole graph is not feasible.\n",
"\n",
"To overcome this problem, we can use the `average_shortest_path` function from the `utils` module to compute the average shortest path length on a random sub-sample of the graph. The functions takes as input:\n",
"\n",
"- `G: networkx graph object`: the graph on which we want to compute the average shortest path length\n",
"- `k: int (default=None)`: percentage of nodes to remove from the graph. If k is None, the average shortest path length of each connected component is computed using all the nodes of the connected component.\n",
"\n",
"And returns:\n",
"\n",
"- `float`: the average shortest path length of the graph\n",
"\n",
"The implementation is very straightforward. First we remove a random sub-sample of the nodes from the graph. Then we create a list with all the connected components of the sub-sampled graph with at least 10 nodes and finally we compute the average shortest path length using the networkx function `average_shortest_path_length`. The choice of 10 nodes is arbitrary and based on empirical observations. We do that to avoid creating small communities with a very low average shortest path length that could bias our results.\n",
"\n",
"Depending on the machine and the time available, we can choose different values for `k`. Lower values will give us a more precise result, but will take longer to compute. On the other hand, higher values will give us a less precise result, but will be faster to compute. However, this time this is not a too much time consuming operation. So if we are willing to wait a bit, I suggest to use `k=None` and test it one the whole graph.\n",
"\n",
"\n",
"\n",
"\n",
"<!-- We have seen how we can characterize the clustering in a small world network. Now we can see the second important property of small-world networks is their small diameter, i.e., the small distance between nodes in the network. The distance in the underlying lattice behaves as the linear length of the lattice, L. Since $N \\sim L^d$ where $d$ is the lattice dimension, it follows that the distance between nodes behaves as:\n",
"\n",
"\\begin{equation}\n",
" l \\sim L \\sim N^{1/d}\n",
"\\end{equation}\n",
"\n",
"Therefore, the underlying lattice has a finite dimension, and the distances on it behave as a power law of the number of nodes, i.e., the distance between nodes is large. However, when adding even a small fraction of shortcuts to the network, this behavior changes dramatically. \n",
"\n",
"Let's try to deduce the behavior of the average distance between nodes. Consider a small-world network, with dimension d and connecting distance $k$ (i.e., every node is connected to any other node whose distance from it in every linear dimension is at most $k$). Now, consider the nodes reachable from a source node with at most $r$ steps. When $r$ is small, these are just the \\emph{r-th} nearest neighbors of the source in the underlying lattice. We term the set of these neighbors a “patch”. the radius of which is $kr$ , and the number of nodes it contains is approximately $n(r) = (2kr)d$. \n",
"\n",
"We now want to find the distance r for which such a patch will contain about one shortcut. This will allow us to consider this patch as if it was a single node in a randomly connected network. Assume that the probability for a single node to have a shortcut is $\\Phi$. To find the length for which approximately one shortcut is encountered, we need to solve for $r$ the following equation: $(2kr)^d \\Phi = 1$. The correlation length $\\xi$ defined as the distance (or linear size of a patch) for which a shortcut will be encountered with high probability is therefore,\n",
"\n",
"\\begin{equation}\n",
" \\xi = \\frac{1}{k \\Phi^{1/d}}\n",
"\\end{equation}\n",
"\n",
"Note that we have omitted the factor 2, since we are interested in the order of magnitude. Let us denote by $V(r)$ the total number of nodes reachable from a node by at most $r$ steps, and by $a(r)$, the number of nodes added to a patch in the \\emph{r-th} step. That is, $a(r) = n(r) - n(r-1)$. Thus,\n",
"\n",
"\\begin{equation}\n",
" a(r) \\sim \\frac{\\text{d} n(r)}{\\text{d} r} = 2kd(2kr)^{d-1}\n",
"\\end{equation}\n",
"\n",
"When a shortcut is encountered at the r step from a node, it leads to a new patch \\footnote{It may actually lead to an already encountered patch, and two patches may also merge after some steps, but this occurs with negligible probability when $N \\to \\infty$ until most of the network is reachable}. This new patch occurs after $r'$ steps, and therefore the number of nodes reachable from its origin is $V (r - r')$. Thus, we obtain the recursive relation\n",
"\n",
"\\begin{equation} \n",
" V(r) = \\sum_{r'=0}^r a(r') [1 + \\xi^{-d}V(r-r')]\n",
"\\end{equation}\n",
"\n",
"where the first term stands for the size of the original patch, and the second term is derived from the probability of hitting a shortcut, which is approximately $\\xi -d $ for every new node encountered. To simplify the solution of \\ref{eq:recursion}, it can be approximated by a differential equation. The sum can be approximated by an integral, and then the equation can be differentiated with respect to $r$ . For simplicity, we will concentrate here on the solution for the one-dimensional case, with $k = 1$, where $a(r) = 2$. Thus, one obtains\n",
"\n",
"\\begin{equation}\n",
" \\frac{\\text{d} V(r)}{\\text{d} r} = 2 [1 + V(r)/\\xi]\n",
"\\end{equation}\n",
"\n",
"the solution of which is:\n",
"\n",
"\\begin{equation} \n",
" V(r) = \\xi \\left(e^{2r/\\xi} -1\\right)\n",
"\\end{equation}\n",
"\n",
"For $r \\ll \\xi$, the exponent can be expanded in a power series, and one obtains $V(r) \\sim 2r = n(r)$, as expected, since usually no shortcut is encountered. For $r \\ gg \\xi$, $V(r)$. An approximation for the average distance between nodes can be obtained by equating $V(r)$ from \\ref*{eq:V(r)} to the total number of nodes, $V(r) = N$. This results in\n",
"\n",
"\\begin{equation} \n",
" r \\sim \\frac{\\xi}{2} \\ln \\frac{N}{\\xi} \n",
"\\end{equation}\n",
" -->\n"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Computing average shortest path length for graph: Brightkite Checkins Graph\n",
"\tNumber of connected components with more then 10 nodes: 1 \n",
"\tAverage shortest path length: 3.01ngth of connected component with 6349 nodes and 292842 edges \n",
"\tCPU time: 119.2 seconds\n",
"\n",
"Computing average shortest path length for graph: Gowalla Checkins Graph\n",
"\tNumber of connected components with more then 10 nodes: 1 \n",
"\tAverage shortest path length: 3.51ngth of connected component with 3010 nodes and 62754 edges \n",
"\tCPU time: 17.7 seconds\n",
"\n",
"Computing average shortest path length for graph: Foursquare Checkins Graph\n",
"\tNumber of connected components with more then 10 nodes: 1 \n",
"\tAverage shortest path length: 2.19ngth of connected component with 2303 nodes and 246690 edges \n",
"\tCPU time: 29.8 seconds\n",
"\n",
"Computing average shortest path length for graph: Brightkite Friendship Graph\n",
"\tNumber of connected components with more then 10 nodes: 1 \n",
"\tAverage shortest path length: 5.23ngth of connected component with 5040 nodes and 14456 edges \n",
"\tCPU time: 25.1 seconds\n",
"\n",
"Computing average shortest path length for graph: (Filtered) Gowalla Friendship Graph\n",
"\tNumber of connected components with more then 10 nodes: 1 \n",
"\tAverage shortest path length: 5.4ength of connected component with 1991 nodes and 5287 edges \n",
"\tCPU time: 3.5 seconds\n",
"\n",
"Computing average shortest path length for graph: Foursquare Friendship Graph\n",
"\tNumber of connected components with more then 10 nodes: 2 \n",
"\tAverage shortest path length: 6.46ngth of connected component with 12 nodes and 13 edges ges \n",
"\tCPU time: 1.3 seconds\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Graph</th>\n",
" <th>Number of Nodes</th>\n",
" <th>Number of Edges</th>\n",
" <th>Average Degree</th>\n",
" <th>Average Clustering Coefficient</th>\n",
" <th>log N</th>\n",
" <th>Average Shortest Path Length</th>\n",
" <th>betweenness centrality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Brightkite Checkins Graph</td>\n",
" <td>6493</td>\n",
" <td>292973</td>\n",
" <td>90.242723</td>\n",
" <td>0.713999</td>\n",
" <td>8.778480</td>\n",
" <td>3.013369</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Gowalla Checkins Graph</td>\n",
" <td>3073</td>\n",
" <td>62790</td>\n",
" <td>40.865604</td>\n",
" <td>0.548372</td>\n",
" <td>8.030410</td>\n",
" <td>3.508031</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Foursquare Checkins Graph</td>\n",
" <td>2324</td>\n",
" <td>246702</td>\n",
" <td>212.30809</td>\n",
" <td>0.65273</td>\n",
" <td>7.751045</td>\n",
" <td>2.186112</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Brightkite Friendship Graph</td>\n",
" <td>5420</td>\n",
" <td>14690</td>\n",
" <td>5.420664</td>\n",
" <td>0.218571</td>\n",
" <td>8.597851</td>\n",
" <td>5.231807</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>(Filtered) Gowalla Friendship Graph</td>\n",
" <td>2294</td>\n",
" <td>5548</td>\n",
" <td>4.836966</td>\n",
" <td>0.234293</td>\n",
" <td>7.738052</td>\n",
" <td>5.396488</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Foursquare Friendship Graph</td>\n",
" <td>1397</td>\n",
" <td>5323</td>\n",
" <td>7.620616</td>\n",
" <td>0.183485</td>\n",
" <td>7.242082</td>\n",
" <td>6.45841</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Graph Number of Nodes Number of Edges \\\n",
"0 Brightkite Checkins Graph 6493 292973 \n",
"1 Gowalla Checkins Graph 3073 62790 \n",
"2 Foursquare Checkins Graph 2324 246702 \n",
"3 Brightkite Friendship Graph 5420 14690 \n",
"4 (Filtered) Gowalla Friendship Graph 2294 5548 \n",
"5 Foursquare Friendship Graph 1397 5323 \n",
"\n",
" Average Degree Average Clustering Coefficient log N \\\n",
"0 90.242723 0.713999 8.778480 \n",
"1 40.865604 0.548372 8.030410 \n",
"2 212.30809 0.65273 7.751045 \n",
"3 5.420664 0.218571 8.597851 \n",
"4 4.836966 0.234293 7.738052 \n",
"5 7.620616 0.183485 7.242082 \n",
"\n",
" Average Shortest Path Length betweenness centrality \n",
"0 3.013369 NaN \n",
"1 3.508031 NaN \n",
"2 2.186112 NaN \n",
"3 5.231807 NaN \n",
"4 5.396488 NaN \n",
"5 6.45841 NaN "
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"for graph in graphs_all:\n",
" print(\"\\nComputing average shortest path length for graph: \", graph.name)\n",
"\n",
" start = time.time()\n",
" average_shortest_path_length = average_shortest_path(graph)\n",
" end = time.time()\n",
"\n",
" print(\"\\tAverage shortest path length: {}\".format(round(average_shortest_path_length,2)))\n",
" print(\"\\tCPU time: \" + str(round(end-start,1)) + \" seconds\")\n",
" \n",
" analysis_results.loc[analysis_results['Graph'] == graph.name, 'Average Shortest Path Length'] = average_shortest_path_length\n",
"\n",
"analysis_results"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Betweenness Centrality\n",
"\n",
"The importance of a node in a network depends on many factors. A website may be important due to its content, a router due to its capacity. Of course, all of these properties depend on the nature\n",
"of the studied network, and may have very little to do with the graph structure of the network. We are particularly interested in the importance of a node (or a link) due to its topological function in the network. It is reasonable to assume that the topology of a network may dictate some intrinsic importance for different nodes. One measure of centrality can be the degree of a\n",
"node. The higher the degree, the more the node is connected, and therefore, the higher is its centrality in the network. However, the degree is not the only factor determining a node's importance \n",
"\n",
"One of the most accepted definitions of centrality is based on counting paths going through a node. For each node, i, in the network, the number of “routing” paths to all other nodes (i.e., paths through which data flow) going through i is counted, and this number determines the centrality i. The most common selection is taking only\n",
"the shortest paths as the routing paths. This leads to the following definition: the _betweenness centrality_ of a node, i, equals the number of shortest paths between all pairs of nodes in the network going through it, i.e.,\n",
"\n",
"\\begin{equation} \n",
" g(i) = \\sum_{\\{ j,k \\}} g_i (j,k)\n",
"\\end{equation}\n",
"\n",
"where the notation $\\{j, k\\}$ stands for summing each pair once, ignoring the order, and $g_i(j, k)$ equals $1$ if the shortest path between nodes $j$ and $k$ passes through node $i$ and $0$ otherwise. In fact, in networks with no weight (i.e., where all edges have the same length), there might be more than one shortest path. In that case, it is common to take $g_i(j, k) = C_i(j,k)/C(j,k)$, where $C(j,k)$ is the number of shortest paths between $j$ and $k$, and $C_i(j,k)$ is the number of those going through $i$. \n",
"\n",
"> Several variations of this scheme exist, focusing, in particular, on how to count distinct shortest paths (if several shortest paths share some edges). These differences tend to have a very small statistical influence in random complex networks, where the number of short loops is small. Therefore, we will concentrate on the above definition. Another nuance is whether the source and destination are considered part of the shortest path.\n",
"\n",
"The usefulness of the betweenness centrality in identifying bottlenecks and important nodes in the network has led to applications in identifying communities in biological and social networks.\n",
"\n",
"--- \n",
"\n",
"Let's see how to compute this centrality measure on our networks. The networkx library has a function that computes the betweenness centrality of all nodes in a network. It is based on the algorithm proposed in the paper\n",
"\n",
"_- Ulrik Brandes, A Faster Algorithm for Betweenness Centrality, Journal of Mathematical Sociology, 25(2):163-177, 2001._\n",
"\n",
"Even if this is a very fast algorithm, it's node enough to run in a reasonable time on large networks. Using the same idea of the previous sections, we can take samplings of our original graph, obtaining an approximate results. Unfortunately, I observed that even with heavy sampling, the time required to run the algorithm is still very high. To avoid using even more heavier samplings (that would bias the results), I decided to use a different approach: parallelization!\n",
"\n",
"In the `utils` module I implemented a function called `betweenness_centrality_parallel`. The function takes as input\n",
"\n",
"- `G: networkx graph object`: the graph on which we want to compute the average shortest path length\n",
"- `processes : int (optional)` The number of processes to use for computation. If `None` (default), processes is set to 1 and the standard betweenness algorithm is used.\n",
"- `k: int (default=None)`: percentage of nodes to remove from the graph. If k is None, the average shortest path length of each connected component is computed using all the nodes of the connected component.\n",
"\n",
"> **Memory Note:** Do not use more then 6 process for big graphs, otherwise the memory will be full. Do it only if you have more at least 32 GB of RAM. For small graphs, you can use more processes.\n",
"\n",
"The implemented functions divide the network in chunks of nodes and compute their contribution to the betweenness centrality of the whole network. Each chunk is computed in parallel, and the results are summed up to obtain the final result. The function returns a dictionary with the betweenness centrality of each node. For more information, see the function code in the `utils` module.\n",
"\n",
"Depending on the machine and the time available, we can choose different values for `k`. Lower values will give us a more precise result, but will take longer to compute. On the other hand, higher values will give us a less precise result, but will be faster to compute. I suggest to use `k=0.6` to test very quickly the function, and at least `k=0.2` to get a more precise result."
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Computing the approximate betweenness centrality for the Brightkite Checkins Graph...\n",
"\tNumber of nodes after removing 50.0% of nodes: 3247\n",
"\tNumber of edges after removing 50.0% of nodes: 73359\n",
"\tBetweenness centrality: 0.0005341953776334369 \n",
"\tCPU time: 34.5 seconds\n",
"\n",
"Computing the approximate betweenness centrality for the Gowalla Checkins Graph...\n",
"\tNumber of nodes after removing 50.0% of nodes: 1537\n",
"\tNumber of edges after removing 50.0% of nodes: 15424\n",
"\tBetweenness centrality: 0.001276770664894951 \n",
"\tCPU time: 5.5 seconds\n",
"\n",
"Computing the approximate betweenness centrality for the Foursquare Checkins Graph...\n",
"\tNumber of nodes after removing 50.0% of nodes: 1162\n",
"\tNumber of edges after removing 50.0% of nodes: 66899\n",
"\tBetweenness centrality: 0.0009378382408594679 \n",
"\tCPU time: 11.5 seconds\n",
"\n",
"Computing the approximate betweenness centrality for the Brightkite Friendship Graph...\n",
"\tNumber of nodes after removing 50.0% of nodes: 2710\n",
"\tNumber of edges after removing 50.0% of nodes: 3777\n",
"\tBetweenness centrality: 0.0006636813228671013 \n",
"\tCPU time: 4.5 seconds\n",
"\n",
"Computing the approximate betweenness centrality for the (Filtered) Gowalla Friendship Graph...\n",
"\tNumber of nodes after removing 50.0% of nodes: 1147\n",
"\tNumber of edges after removing 50.0% of nodes: 1529\n",
"\tBetweenness centrality: 0.0013313378500865271 \n",
"\tCPU time: 0.9 seconds\n",
"\n",
"Computing the approximate betweenness centrality for the Foursquare Friendship Graph...\n",
"\tNumber of nodes after removing 50.0% of nodes: 699\n",
"\tNumber of edges after removing 50.0% of nodes: 1400\n",
"\tBetweenness centrality: 0.0015311069213178477 \n",
"\tCPU time: 0.6 seconds\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Graph</th>\n",
" <th>Number of Nodes</th>\n",
" <th>Number of Edges</th>\n",
" <th>Average Degree</th>\n",
" <th>Average Clustering Coefficient</th>\n",
" <th>log N</th>\n",
" <th>Average Shortest Path Length</th>\n",
" <th>betweenness centrality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Brightkite Checkins Graph</td>\n",
" <td>6493</td>\n",
" <td>292973</td>\n",
" <td>90.242723</td>\n",
" <td>0.713999</td>\n",
" <td>8.778480</td>\n",
" <td>3.013369</td>\n",
" <td>0.000534</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Gowalla Checkins Graph</td>\n",
" <td>3073</td>\n",
" <td>62790</td>\n",
" <td>40.865604</td>\n",
" <td>0.548372</td>\n",
" <td>8.030410</td>\n",
" <td>3.508031</td>\n",
" <td>0.001277</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Foursquare Checkins Graph</td>\n",
" <td>2324</td>\n",
" <td>246702</td>\n",
" <td>212.30809</td>\n",
" <td>0.65273</td>\n",
" <td>7.751045</td>\n",
" <td>2.186112</td>\n",
" <td>0.000938</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Brightkite Friendship Graph</td>\n",
" <td>5420</td>\n",
" <td>14690</td>\n",
" <td>5.420664</td>\n",
" <td>0.218571</td>\n",
" <td>8.597851</td>\n",
" <td>5.231807</td>\n",
" <td>0.000664</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>(Filtered) Gowalla Friendship Graph</td>\n",
" <td>2294</td>\n",
" <td>5548</td>\n",
" <td>4.836966</td>\n",
" <td>0.234293</td>\n",
" <td>7.738052</td>\n",
" <td>5.396488</td>\n",
" <td>0.001331</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Foursquare Friendship Graph</td>\n",
" <td>1397</td>\n",
" <td>5323</td>\n",
" <td>7.620616</td>\n",
" <td>0.183485</td>\n",
" <td>7.242082</td>\n",
" <td>6.45841</td>\n",
" <td>0.001531</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Graph Number of Nodes Number of Edges \\\n",
"0 Brightkite Checkins Graph 6493 292973 \n",
"1 Gowalla Checkins Graph 3073 62790 \n",
"2 Foursquare Checkins Graph 2324 246702 \n",
"3 Brightkite Friendship Graph 5420 14690 \n",
"4 (Filtered) Gowalla Friendship Graph 2294 5548 \n",
"5 Foursquare Friendship Graph 1397 5323 \n",
"\n",
" Average Degree Average Clustering Coefficient log N \\\n",
"0 90.242723 0.713999 8.778480 \n",
"1 40.865604 0.548372 8.030410 \n",
"2 212.30809 0.65273 7.751045 \n",
"3 5.420664 0.218571 8.597851 \n",
"4 4.836966 0.234293 7.738052 \n",
"5 7.620616 0.183485 7.242082 \n",
"\n",
" Average Shortest Path Length betweenness centrality \n",
"0 3.013369 0.000534 \n",
"1 3.508031 0.001277 \n",
"2 2.186112 0.000938 \n",
"3 5.231807 0.000664 \n",
"4 5.396488 0.001331 \n",
"5 6.45841 0.001531 "
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"for graph in graphs_all:\n",
" print(\"\\nComputing the approximate betweenness centrality for the {}...\".format(graph.name))\n",
" start = time.time()\n",
" betweenness_centrality = np.mean(list(betweenness_centrality_parallel(graph, 4, k = 0.5).values()))\n",
" end = time.time()\n",
" print(\"\\tBetweenness centrality: {} \".format(betweenness_centrality))\n",
" print(\"\\tCPU time: \" + str(round(end-start,1)) + \" seconds\")\n",
"\n",
" analysis_results.loc[analysis_results['Graph'] == graph.name, 'betweenness centrality'] = betweenness_centrality\n",
"\n",
"analysis_results.to_pickle('analysis_results.pkl')\n",
"analysis_results"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download the dataframe with accurate results\n",
"\n",
"I run all the measures above on a server with as less sampling as possible. The computation took a lot of time, but are inevitably more precise. From now on, we will refer to those results"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"analysis_results = pd.read_pickle('analysis_results.pkl')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Analysis of the results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Distribution of Degree\n",
"\n",
"\n",
"The Erdős-Rényi model has traditionally been the dominant subject of study in the field of random graphs. Recently, however, several studies of real-world networks have found that the ER model fails to reproduce many of their observed properties. One of the simplest properties of a network that can be measured directly is the degree distribution, or the fraction $P(k)$ of nodes having k connections (degree $k$). A well-known result for ER networks is that the degree distribution is Poissonian,\n",
"\n",
"\\begin{equation}\n",
" P(k) = \\frac{e^{z} z^k}{k!}\n",
"\\end{equation}\n",
"\n",
"Where $z = \\langle k \\rangle$. is the average degree. Direct measurements of the degree distribution for real networks show that the Poisson law does not apply. Rather, often these nets exhibit a scale-free degree distribution:\n",
"\n",
"\\begin{equation}\n",
" P(k) = ck^{-\\gamma} \\quad \\text{for} \\quad k = m, ... , K\n",
"\\end{equation}\n",
"\n",
"Where $c \\sim (\\gamma -1)m^{\\gamma - 1}$ is a normalization factor, and $m$ and $K$ are the lower and upper cutoffs for the degree of a node, respectively. The divergence of moments higher then $\\lceil \\gamma -1 \\rceil$ (as $K \\to \\infty$ when $N \\to \\infty$) is responsible for many of the anomalous properties attributed to scale-free networks. \n",
"\n",
"All real-world networks are finite and therefore all their moments are finite. The actual value of the cutoff K plays an important role. It may be approximated by noting that the total probability of nodes with $k > K$ is of order $1/N$\n",
"\n",
"\\begin{equation}\n",
" \\int_K^\\infty P(k) dk \\sim \\frac{1}{N}\n",
"\\end{equation}\n",
"\n",
"This yields the result\n",
"\n",
"\\begin{equation}\n",
" K \\sim m N^{1/(\\gamma -1)}\n",
"\\end{equation}\n",
"\n",
"---\n",
"\n",
"Let's see if our networks are scale-free or not. We can use the `degree_distribution` function from the `utils` module to plot the degree distribution of a graph. It takes a networkx graph object as input and returns a plot of the degree distribution. We expect to see a power-law distribution and not a Poissonian one."
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
6,
234,
29,
237,
97,
255,
262,
72,
309,
1,
37,
17,
539,
99,
10,
92,
52,
815,
649,
2,
140,
490,
231,
101,
68,
94,
267,
66,
239,
354,
41,
226,
657,
232,
5,
540,
35,
14,
483,
16,
506,
26,
7,
840,
450,
47,
578,
58,
359,
268,
95,
275,
565,
59,
67,
8,
150,
39,
655,
177,
108,
28,
128,
181,
156,
148,
236,
250,
383,
257,
297,
260,
50,
193,
201,
40,
13,
80,
18,
345,
43,
104,
584,
102,
12,
31,
138,
867,
48,
55,
243,
317,
85,
79,
259,
76,
130,
246,
238,
282,
286,
266,
526,
715,
3,
45,
241,
89,
264,
112,
42,
168,
184,
20,
251,
34,
466,
117,
208,
82,
122,
252,
303,
22,
535,
179,
180,
153,
69,
114,
25,
746,
224,
205,
149,
53,
475,
195,
84,
11,
24,
135,
194,
103,
71,
487,
1021,
167,
220,
4,
129,
107,
508,
36,
788,
21,
9,
162,
30,
254,
273,
216,
227,
113,
467,
543,
51,
32,
755,
93,
81,
621,
320,
27,
548,
49,
222,
908,
219,
456,
209,
54,
60,
15,
170,
326,
44,
160,
96,
272,
707,
261,
496,
477,
106,
174,
61,
19,
57,
90,
115,
38,
73,
23,
189,
462,
221,
214,
469,
215,
62,
109,
294,
310,
468,
91,
296,
290,
46,
166,
199,
64,
98,
88,
906,
786,
87,
176,
517,
182,
536,
77,
498,
242,
663,
528,
853,
33,
280,
318,
151,
116,
1043,
110,
671,
803,
480,
678,
142,
738,
65,
493,
321,
716,
161,
126,
211,
218,
292,
187,
384,
674,
169,
198,
197,
131,
253,
591,
100,
229,
311,
541,
499,
173,
230,
1004,
245,
497,
63,
388,
305,
228,
278,
500,
491,
127,
723,
365,
74,
136,
247,
240,
476,
516,
159,
223,
277,
213,
56,
553,
790,
86,
323,
225,
178,
433,
378,
593,
212,
465,
478,
601,
206,
728,
346,
463,
550,
461,
171,
146,
158,
83,
400,
521,
651,
319,
335,
349,
200,
484,
334,
333,
549,
165,
75,
256,
387,
118,
376,
698,
314,
472,
546,
291,
190,
430,
269,
350,
492,
734,
298,
572,
185,
157,
357,
330,
235,
353,
186,
137,
233,
471,
258,
105,
504,
154,
360,
70,
544,
473,
217,
510,
817,
192,
495,
139,
464,
204,
434,
145,
327,
444,
155,
210,
666,
645,
397,
502,
556,
152,
639,
133,
124,
603,
313,
271,
460,
249,
527,
248,
445,
665,
777,
518,
656,
615,
367,
596,
279,
511,
316,
325,
183,
175,
163,
605,
144,
675,
481,
567,
344,
301,
524,
111,
704,
302,
689,
505,
332,
147,
512,
509,
585,
486,
660,
479,
538,
580,
501,
552,
520,
577,
525,
751,
534,
134,
607,
545,
561,
485,
503,
470,
547,
514,
778,
630,
770,
188,
125,
494,
482,
610,
123,
574,
489,
141,
120,
616,
531,
837,
754,
488,
731,
783,
699,
513,
474,
306,
265,
274,
191,
372,
389,
368,
270,
132,
415,
393,
288,
244,
121
],
"y": [
166,
9,
20,
4,
12,
7,
4,
6,
1,
619,
32,
59,
2,
13,
120,
13,
37,
1,
1,
333,
3,
2,
8,
5,
17,
11,
3,
18,
9,
1,
19,
13,
1,
6,
178,
4,
31,
88,
4,
62,
4,
53,
146,
1,
1,
44,
1,
14,
1,
3,
10,
2,
1,
31,
36,
116,
38,
27,
2,
1,
5,
40,
2,
4,
9,
7,
14,
1,
1,
1,
3,
2,
41,
1,
1,
32,
100,
9,
88,
2,
10,
5,
2,
8,
90,
32,
4,
1,
21,
44,
3,
1,
5,
40,
2,
8,
5,
3,
5,
2,
2,
3,
3,
1,
231,
12,
6,
62,
2,
5,
17,
1,
1,
83,
2,
46,
9,
7,
2,
13,
4,
2,
4,
48,
3,
3,
3,
8,
15,
6,
45,
2,
15,
1,
5,
63,
3,
3,
9,
104,
39,
2,
1,
11,
10,
3,
1,
3,
15,
220,
5,
6,
1,
44,
1,
49,
122,
2,
36,
2,
1,
5,
17,
4,
6,
1,
11,
34,
2,
8,
9,
1,
1,
37,
4,
23,
29,
1,
20,
1,
2,
25,
16,
61,
3,
1,
33,
4,
9,
3,
1,
4,
2,
4,
4,
2,
20,
64,
12,
26,
2,
38,
7,
47,
3,
6,
65,
9,
7,
9,
10,
6,
1,
1,
6,
18,
2,
1,
50,
5,
2,
17,
9,
17,
1,
1,
36,
3,
3,
2,
1,
11,
2,
6,
2,
2,
1,
18,
1,
2,
12,
4,
1,
5,
2,
1,
2,
1,
2,
1,
8,
3,
1,
1,
4,
2,
63,
80,
2,
3,
1,
1,
3,
1,
3,
1,
2,
1,
4,
10,
1,
2,
2,
1,
4,
1,
4,
7,
12,
1,
1,
12,
2,
1,
3,
3,
1,
1,
8,
3,
3,
5,
4,
2,
3,
12,
2,
12,
35,
1,
1,
24,
1,
12,
5,
1,
1,
1,
15,
6,
5,
1,
1,
1,
3,
11,
1,
14,
2,
7,
5,
6,
2,
2,
5,
2,
1,
1,
2,
4,
3,
1,
1,
5,
10,
4,
1,
4,
1,
1,
1,
3,
1,
2,
3,
2,
2,
1,
2,
1,
1,
1,
1,
2,
1,
1,
5,
1,
1,
1,
6,
4,
2,
6,
2,
5,
1,
5,
2,
6,
2,
1,
1,
2,
1,
3,
11,
2,
1,
4,
1,
1,
5,
1,
1,
1,
1,
2,
2,
6,
1,
3,
4,
1,
1,
1,
144,
2,
1,
1,
1,
1,
1,
1,
1,
1,
2,
1,
2,
1,
1,
1,
2,
2,
3,
1,
31,
3,
1,
1,
1,
1,
1,
4,
1,
1,
1,
1,
3,
5,
5,
1,
1,
2,
1,
5,
2,
1,
1,
1,
2,
1,
2,
1,
1,
2,
1,
1,
1,
5,
1,
5,
1,
1,
1,
1,
1,
2,
1,
1,
3,
1,
1,
1,
1,
4,
1,
1,
2,
1,
1,
2,
1,
1,
1,
1,
1,
1,
1,
2,
1,
1,
1,
1,
1,
1,
1,
1,
2,
1,
1
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution (log-log scale) of Brightkite Checkins Graph"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
},
"type": "log"
},
"yaxis": {
"title": {
"text": "Number of Nodes"
},
"type": "log"
}
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
26,
208,
59,
81,
21,
146,
37,
50,
4,
5,
3,
77,
117,
35,
28,
78,
109,
76,
9,
60,
186,
291,
444,
8,
118,
42,
54,
39,
90,
134,
61,
55,
63,
41,
31,
16,
92,
66,
91,
127,
84,
303,
124,
12,
103,
36,
69,
53,
83,
30,
49,
40,
33,
67,
119,
137,
231,
29,
17,
6,
43,
122,
94,
14,
281,
93,
115,
98,
145,
19,
47,
7,
62,
38,
20,
24,
13,
120,
89,
34,
48,
184,
2,
409,
73,
58,
157,
222,
101,
25,
57,
105,
74,
112,
18,
171,
32,
153,
159,
88,
51,
149,
52,
46,
106,
82,
70,
202,
1,
71,
22,
75,
218,
44,
100,
27,
107,
158,
179,
110,
181,
131,
264,
333,
97,
199,
300,
252,
474,
79,
545,
15,
45,
271,
212,
23,
10,
56,
87,
64,
68,
244,
237,
11,
152,
129,
148,
160,
204,
245,
80,
111,
196,
128,
419,
170,
501,
379,
164,
200,
280,
256,
368,
325,
235,
243,
95,
448,
180,
114,
143,
390,
309,
85,
302,
491,
133,
326,
147,
177,
255,
205,
217,
126,
210,
113,
201,
345,
266,
108,
136,
99,
194,
162,
139,
316,
292,
173,
65,
225,
240,
72,
242,
161,
166,
175,
192,
144,
156,
183,
198,
123,
168,
211,
323,
265,
301,
174,
287,
176,
132,
172,
203,
154,
277,
283,
282,
214,
353,
197,
190,
221,
125,
261,
130,
167,
234,
439,
188,
238,
219,
315,
272,
236,
337,
278,
193,
253,
305,
151,
165,
140,
273,
288,
215,
96,
207,
86,
187,
104,
182,
150,
213,
121,
268,
227
],
"y": [
22,
2,
10,
5,
36,
3,
22,
21,
138,
91,
186,
5,
5,
19,
34,
7,
2,
11,
78,
7,
2,
3,
2,
66,
1,
23,
10,
18,
7,
5,
10,
9,
9,
23,
27,
49,
6,
13,
3,
7,
8,
2,
3,
42,
4,
22,
12,
4,
8,
29,
16,
27,
19,
10,
4,
7,
3,
28,
33,
79,
17,
1,
6,
30,
1,
3,
4,
6,
1,
30,
11,
68,
7,
21,
37,
26,
46,
6,
5,
15,
18,
2,
218,
2,
14,
12,
5,
2,
5,
25,
14,
5,
9,
3,
31,
4,
22,
1,
3,
2,
12,
2,
6,
14,
1,
6,
7,
3,
358,
10,
22,
3,
3,
16,
5,
23,
3,
4,
4,
3,
2,
3,
3,
3,
2,
2,
1,
2,
1,
3,
1,
40,
8,
1,
2,
28,
49,
11,
3,
6,
6,
2,
2,
45,
4,
2,
3,
4,
1,
2,
9,
4,
1,
2,
1,
4,
1,
1,
3,
1,
1,
2,
1,
1,
2,
3,
4,
1,
3,
4,
2,
1,
3,
3,
1,
1,
2,
1,
2,
1,
1,
2,
1,
4,
1,
3,
2,
1,
3,
6,
2,
2,
3,
3,
5,
1,
1,
3,
3,
2,
2,
9,
1,
2,
2,
1,
2,
3,
2,
2,
1,
3,
2,
1,
1,
1,
1,
1,
2,
1,
2,
1,
2,
2,
2,
1,
2,
3,
2,
4,
1,
1,
5,
1,
3,
2,
2,
1,
2,
2,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
2,
1,
1,
1,
1,
1,
1,
3,
1,
3,
1,
2,
1,
4,
1,
1
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution (log-log scale) of Gowalla Checkins Graph"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
},
"type": "log"
},
"yaxis": {
"title": {
"text": "Number of Nodes"
},
"type": "log"
}
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
33,
498,
158,
143,
321,
376,
203,
349,
286,
329,
479,
710,
76,
422,
289,
600,
271,
793,
469,
110,
333,
1038,
434,
371,
729,
556,
276,
302,
728,
477,
771,
320,
700,
277,
461,
507,
548,
283,
493,
311,
294,
444,
623,
673,
468,
545,
879,
502,
278,
492,
598,
282,
524,
578,
735,
394,
449,
759,
308,
388,
497,
722,
476,
737,
421,
315,
518,
855,
1061,
313,
335,
392,
620,
454,
280,
828,
406,
432,
381,
383,
856,
544,
327,
696,
512,
564,
312,
346,
674,
792,
796,
521,
474,
443,
466,
370,
712,
859,
368,
340,
970,
395,
606,
316,
854,
345,
689,
336,
781,
279,
553,
360,
378,
486,
653,
331,
420,
314,
902,
829,
299,
632,
403,
355,
292,
484,
570,
281,
920,
409,
325,
412,
431,
428,
347,
440,
773,
318,
941,
705,
823,
485,
552,
363,
362,
652,
319,
875,
451,
483,
359,
944,
310,
585,
684,
339,
334,
495,
642,
893,
305,
562,
467,
780,
622,
789,
560,
456,
288,
455,
487,
337,
442,
536,
969,
373,
344,
768,
482,
567,
624,
803,
917,
475,
905,
857,
1004,
534,
372,
390,
436,
908,
743,
284,
457,
450,
365,
341,
832,
821,
549,
509,
307,
1,
898,
13,
6,
74,
44,
825,
160,
234,
489,
625,
22,
60,
40,
86,
88,
87,
15,
3,
49,
12,
211,
539,
129,
244,
532,
317,
107,
580,
223,
348,
242,
34,
119,
441,
55,
42,
257,
128,
437,
256,
471,
132,
207,
121,
84,
850,
62,
293,
609,
576,
295,
361,
571,
701,
516,
222,
7,
100,
488,
915,
351,
228,
894,
382,
332,
739,
59,
61,
117,
185,
112,
718,
125,
81,
188,
255,
108,
174,
526,
189,
235,
910,
249,
517,
241,
209,
681,
219,
164,
72,
167,
138,
176,
246,
155,
480,
73,
531,
473,
470,
868,
396,
306,
247,
672,
530,
205,
196,
152,
142,
141,
175,
644,
101,
503,
229,
140,
811,
206,
105,
104,
106,
102,
187,
697,
27,
71,
154,
550,
691,
814,
192,
91,
546,
41,
157,
591,
21,
231,
146,
172,
830,
387,
841,
411,
555,
145,
551,
758,
791,
197,
405,
183,
410,
178,
151,
890,
221,
193,
232,
191,
268,
245,
520,
889,
43,
28,
201,
379,
777,
285,
45,
262,
10,
204,
19,
755,
23,
2,
137,
500,
510,
9,
522,
275,
572,
8,
179,
4,
790,
540,
590,
243,
711,
265,
264,
24,
356,
58,
224,
70,
490,
304,
236,
272,
618,
523,
838,
658,
296,
404,
592,
880,
621,
583,
513,
662,
809,
690,
588,
506,
352,
640,
324,
757,
251,
238,
706,
651,
237,
797,
646,
51,
358,
426,
226,
323,
648,
364,
163,
90,
225,
17,
326,
166,
584,
96,
433,
139,
367,
80,
230,
5,
529,
111,
679,
66,
423,
64,
239,
472,
89,
369,
415,
462,
547,
525,
79,
375,
582,
414,
543,
274,
478,
425,
126,
602,
258,
671,
180,
511,
148,
505,
416,
393,
669,
350,
215,
504,
213,
36,
398,
165,
427,
131,
26,
494,
465,
508,
665,
377,
50,
254,
32,
83,
16,
94,
135,
54,
212,
537,
724,
198,
39,
557,
48,
184,
496,
692,
113,
635,
565,
68,
639,
63,
273,
47,
56,
452,
69,
218,
136,
46,
723,
200,
354,
569,
541,
297,
14,
438,
248,
447,
194,
181,
663,
173,
617,
408,
171,
208,
252,
435,
153,
267,
661,
134,
501,
400,
633,
563,
353,
118,
605,
380,
424,
568,
448,
499,
328,
481,
417,
491,
391,
459,
343,
342,
397,
594,
389,
558,
384,
147,
168,
182,
270,
322,
120,
53,
233,
97,
144,
210,
300,
161,
85,
123,
77,
150,
130,
214,
199,
149,
95,
162,
57,
216,
115,
92,
29,
114,
202,
38,
122,
301,
169,
303,
67,
98,
99,
177,
103,
30,
18,
338,
31,
78,
253,
266,
269,
260,
330,
259,
250,
93,
116,
82,
25,
11,
170,
156,
65,
195,
35,
20,
37,
75,
109,
52
],
"y": [
7,
2,
4,
14,
7,
3,
6,
3,
6,
5,
2,
2,
4,
4,
4,
1,
1,
1,
3,
1,
2,
1,
1,
2,
1,
2,
33,
4,
1,
3,
1,
4,
1,
6,
3,
3,
1,
5,
2,
3,
2,
3,
1,
1,
1,
3,
1,
3,
5,
4,
3,
3,
3,
2,
2,
2,
2,
1,
4,
3,
1,
1,
2,
1,
2,
3,
1,
1,
1,
5,
2,
4,
4,
2,
3,
1,
1,
4,
5,
3,
2,
3,
4,
1,
1,
4,
2,
4,
2,
1,
1,
5,
2,
3,
4,
5,
1,
1,
1,
2,
1,
4,
1,
2,
1,
3,
1,
2,
2,
4,
1,
2,
4,
2,
2,
4,
1,
2,
1,
1,
4,
1,
1,
1,
3,
4,
1,
9,
1,
4,
5,
2,
3,
1,
5,
4,
1,
2,
1,
1,
1,
2,
2,
4,
3,
2,
1,
1,
3,
2,
2,
1,
1,
1,
1,
3,
2,
3,
1,
1,
2,
2,
4,
1,
1,
2,
1,
3,
2,
2,
1,
1,
3,
3,
1,
1,
3,
1,
3,
1,
3,
1,
1,
3,
1,
1,
1,
5,
1,
1,
3,
1,
2,
3,
2,
4,
2,
2,
1,
1,
1,
3,
4,
116,
1,
13,
28,
10,
12,
1,
3,
3,
3,
1,
12,
3,
7,
8,
5,
4,
12,
47,
6,
13,
3,
2,
4,
1,
1,
3,
3,
1,
6,
2,
2,
23,
5,
2,
6,
10,
2,
4,
3,
2,
2,
3,
2,
3,
5,
1,
13,
2,
2,
2,
4,
3,
2,
1,
4,
1,
17,
3,
2,
1,
3,
3,
1,
1,
1,
1,
7,
7,
3,
2,
2,
1,
2,
9,
3,
1,
4,
2,
4,
4,
2,
1,
1,
1,
3,
3,
2,
3,
5,
11,
1,
1,
1,
5,
3,
2,
5,
2,
2,
2,
1,
3,
3,
2,
1,
2,
5,
8,
8,
6,
13,
3,
3,
5,
2,
1,
2,
1,
3,
4,
3,
6,
4,
5,
1,
8,
9,
4,
2,
1,
1,
5,
5,
2,
8,
2,
1,
5,
2,
5,
1,
1,
2,
1,
2,
3,
3,
1,
2,
1,
5,
1,
1,
2,
4,
4,
1,
4,
1,
3,
4,
3,
3,
2,
1,
15,
7,
2,
1,
1,
2,
5,
2,
20,
3,
17,
1,
7,
60,
2,
3,
2,
20,
1,
3,
2,
13,
3,
24,
1,
1,
1,
2,
1,
4,
3,
9,
1,
6,
3,
7,
3,
3,
8,
4,
1,
2,
1,
2,
4,
2,
1,
1,
1,
1,
3,
1,
1,
2,
1,
1,
2,
1,
5,
1,
1,
2,
2,
1,
2,
1,
1,
2,
1,
1,
3,
1,
1,
3,
1,
2,
3,
11,
1,
13,
1,
6,
4,
3,
3,
12,
3,
20,
2,
7,
2,
3,
4,
8,
2,
2,
6,
3,
2,
2,
2,
1,
3,
3,
1,
1,
1,
2,
1,
2,
4,
2,
1,
1,
3,
2,
1,
1,
1,
2,
1,
1,
2,
2,
1,
4,
1,
19,
1,
1,
4,
1,
2,
1,
1,
1,
1,
1,
12,
5,
12,
1,
8,
7,
2,
1,
1,
4,
4,
1,
8,
3,
1,
1,
3,
1,
1,
5,
1,
5,
3,
8,
6,
2,
7,
1,
3,
19,
1,
2,
1,
1,
1,
2,
14,
1,
1,
3,
2,
1,
1,
3,
1,
1,
1,
4,
2,
1,
1,
1,
1,
4,
3,
1,
1,
2,
3,
2,
1,
1,
1,
1,
1,
4,
1,
2,
2,
1,
2,
1,
2,
1,
1,
1,
1,
1,
1,
3,
2,
3,
2,
1,
3,
37,
2,
2,
3,
1,
2,
1,
5,
2,
2,
2,
2,
1,
1,
2,
5,
1,
7,
3,
2,
3,
17,
5,
2,
8,
2,
1,
4,
1,
7,
2,
4,
2,
3,
8,
6,
1,
4,
2,
1,
1,
2,
1,
1,
1,
1,
5,
2,
5,
8,
9,
2,
1,
2,
1,
4,
12,
6,
2,
1,
2
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution (log-log scale) of Foursquare Checkins Graph"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
},
"type": "log"
},
"yaxis": {
"title": {
"text": "Number of Nodes"
},
"type": "log"
}
}
}
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Plot the degree distribution (log-log scale) of each check-ins graph\n",
"for G in checkins_graphs:\n",
" degree_distribution(G)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
36,
17,
7,
12,
42,
23,
15,
6,
1,
18,
9,
21,
22,
5,
8,
226,
3,
46,
10,
11,
2,
27,
14,
28,
31,
58,
30,
19,
41,
35,
20,
75,
43,
40,
74,
4,
33,
26,
16,
39,
13,
50,
59,
25,
24,
67,
82,
103,
202,
55,
49,
34,
29,
45,
87,
56,
93,
32,
44,
65,
63,
89,
60,
70,
47,
78,
127,
48,
61,
54,
72,
38,
53,
52,
97,
64,
62,
37,
51
],
"y": [
4,
35,
164,
64,
5,
12,
37,
193,
1908,
29,
111,
17,
12,
298,
122,
1,
521,
5,
93,
68,
999,
8,
42,
14,
6,
3,
4,
24,
9,
7,
21,
2,
2,
4,
3,
351,
9,
14,
34,
9,
50,
1,
3,
13,
13,
1,
2,
1,
1,
3,
2,
5,
6,
5,
1,
3,
1,
7,
2,
1,
2,
1,
1,
3,
1,
1,
1,
3,
1,
1,
1,
8,
1,
1,
1,
1,
1,
3,
3
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution (log-log scale) of Brightkite Friendship Graph"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
},
"type": "log"
},
"yaxis": {
"title": {
"text": "Number of Nodes"
},
"type": "log"
}
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
28,
40,
10,
5,
64,
34,
46,
16,
1,
24,
23,
17,
12,
7,
4,
19,
32,
9,
3,
2,
15,
6,
45,
38,
26,
11,
35,
25,
14,
8,
36,
49,
39,
27,
72,
43,
44,
13,
58,
21,
18,
20,
22,
48,
54,
51,
29,
33,
30,
31,
42,
104,
55,
37,
47
],
"y": [
10,
1,
35,
113,
1,
2,
3,
16,
731,
5,
3,
13,
24,
91,
194,
11,
6,
41,
274,
432,
13,
97,
2,
2,
6,
24,
2,
8,
9,
55,
2,
1,
1,
4,
1,
2,
1,
24,
1,
2,
5,
5,
8,
1,
1,
1,
2,
1,
1,
1,
1,
1,
1,
1,
1
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution (log-log scale) of (Filtered) Gowalla Friendship Graph"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
},
"type": "log"
},
"yaxis": {
"title": {
"text": "Number of Nodes"
},
"type": "log"
}
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
10,
13,
9,
30,
14,
25,
45,
18,
56,
3,
8,
12,
37,
66,
1,
23,
36,
40,
4,
5,
6,
7,
33,
42,
15,
19,
27,
31,
49,
11,
63,
22,
24,
28,
46,
2,
32,
149,
58,
17,
54,
16,
48,
21,
20,
29,
59,
26,
38,
51,
44,
47,
43,
52,
39,
53,
34,
41,
73,
57,
35
],
"y": [
30,
19,
28,
2,
18,
11,
3,
8,
3,
130,
37,
17,
5,
1,
402,
12,
7,
2,
94,
69,
50,
43,
2,
3,
21,
8,
6,
5,
1,
24,
1,
8,
5,
8,
2,
217,
5,
1,
1,
15,
2,
10,
2,
6,
9,
8,
1,
7,
2,
1,
1,
1,
2,
1,
6,
1,
5,
4,
1,
1,
2
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution (log-log scale) of Foursquare Friendship Graph"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
},
"type": "log"
},
"yaxis": {
"title": {
"text": "Number of Nodes"
},
"type": "log"
}
}
}
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for graph in friendships_graph:\n",
" degree_distribution(graph)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As we can clearly see from the graphs obtained, the degree distribution of the networks is not Poissonian, but rather scale-free. This is a good indication that the networks are not random, but rather small-world.\n",
"\n",
"Let's try to plot the distribution degree of a random Watts-Strogatz graph with the same number of nodes and a probability of edge creation equal to the number of edges of the network divided by the number of possible edges. We expect to see a Poissonian distribution.\n",
"\n",
 This is a time saving">
"> This is a time-saving approach, NOT a rigorous one. To be rigorous, we should follow the algorithm proposed by Maslov and Sneppen, implemented in the networkx function `random_reference`."
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Brightkite Checkins Graph Watts-Strogatz\n",
"Number of nodes: 2324\n",
"Number of edges: 246344\n"
]
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
207,
208,
186,
192,
220,
209,
228,
182,
212,
217,
206,
221,
211,
200,
204,
227,
215,
226,
231,
216,
194,
232,
201,
219,
214,
197,
202,
199,
196,
230,
205,
203,
223,
218,
222,
224,
210,
198,
195,
213,
237,
239,
247,
193,
233,
234,
225,
190,
235,
236,
185,
229,
240,
238,
188,
191,
189,
241,
180,
242,
183,
245
],
"y": [
98,
84,
2,
10,
64,
81,
29,
3,
89,
79,
78,
66,
85,
52,
77,
34,
86,
35,
18,
78,
22,
13,
58,
69,
76,
30,
57,
48,
25,
21,
77,
63,
55,
70,
46,
42,
94,
44,
16,
94,
2,
3,
1,
13,
16,
11,
36,
8,
10,
10,
6,
10,
3,
5,
2,
7,
5,
3,
1,
1,
2,
1
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution of Brightkite Checkins Graph Watts-Strogatz"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
}
},
"yaxis": {
"title": {
"text": "Number of Nodes"
}
}
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Gowalla Checkins Graph Watts-Strogatz\n",
"Number of nodes: 2324\n",
"Number of edges: 246344\n"
]
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
208,
222,
211,
207,
199,
201,
229,
231,
197,
219,
204,
206,
209,
203,
234,
216,
210,
215,
223,
213,
240,
217,
221,
205,
227,
202,
212,
224,
214,
196,
228,
218,
200,
189,
225,
188,
194,
220,
226,
232,
198,
193,
235,
230,
237,
183,
195,
187,
192,
191,
236,
185,
233,
182,
238,
242,
186,
184,
190,
239,
243,
244
],
"y": [
84,
52,
83,
97,
41,
52,
24,
20,
32,
76,
82,
79,
75,
65,
9,
80,
81,
90,
52,
88,
6,
67,
60,
84,
29,
62,
83,
49,
83,
25,
18,
88,
48,
6,
40,
7,
18,
59,
37,
16,
46,
18,
10,
13,
8,
3,
20,
6,
6,
8,
8,
1,
7,
2,
5,
3,
2,
1,
5,
3,
1,
1
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution of Gowalla Checkins Graph Watts-Strogatz"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
}
},
"yaxis": {
"title": {
"text": "Number of Nodes"
}
}
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Foursquare Checkins Graph Watts-Strogatz\n",
"Number of nodes: 2324\n",
"Number of edges: 246344\n"
]
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
223,
208,
207,
212,
219,
202,
220,
209,
227,
218,
213,
211,
221,
200,
214,
222,
199,
210,
184,
195,
204,
229,
239,
203,
198,
193,
234,
216,
224,
197,
206,
191,
217,
230,
188,
205,
201,
248,
225,
192,
228,
181,
196,
183,
215,
233,
194,
231,
243,
226,
189,
235,
232,
190,
240,
241,
237,
187,
245,
238,
236,
250,
182
],
"y": [
39,
98,
84,
85,
89,
61,
76,
78,
29,
86,
80,
99,
61,
49,
87,
49,
43,
81,
3,
25,
79,
28,
7,
60,
49,
17,
10,
73,
33,
34,
68,
9,
73,
18,
7,
83,
45,
1,
49,
9,
29,
1,
22,
1,
74,
12,
15,
18,
3,
40,
8,
3,
11,
9,
4,
1,
5,
6,
2,
1,
3,
1,
1
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution of Foursquare Checkins Graph Watts-Strogatz"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
}
},
"yaxis": {
"title": {
"text": "Number of Nodes"
}
}
}
}
},
"metadata": {},
"output_type": "display_data"
}
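],
"source": [
"# Build a Watts-Strogatz graph matched to each check-in graph's size and average degree.\n",
"for graph in checkins_graphs:\n",
"\n",
"    n_nodes = graph.number_of_nodes()\n",
"    # k: each node is joined to its k nearest ring neighbours (mean degree of the real graph)\n",
"    avg_degree = int(np.mean([d for _, d in graph.degree()]))\n",
"    # edges/nodes usually exceeds 1, so clamp it to obtain a valid rewiring probability\n",
"    p = min(1.0, graph.number_of_edges() / n_nodes)\n",
"    G = nx.watts_strogatz_graph(n_nodes, avg_degree, p)\n",
"    G.name = graph.name + \" - Watts-Strogatz similarity\"\n",
"\n",
"    print(G.name)\n",
"    print(\"Number of nodes: \", G.number_of_nodes())\n",
"    print(\"Number of edges: \", G.number_of_edges())\n",
"    degree_distribution(G, log=False)"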
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Brightkite Friendship Graph Watts-Strogatz\n",
"Number of nodes: 2324\n",
"Number of edges: 246344\n"
]
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
210,
214,
195,
208,
219,
201,
216,
202,
204,
217,
220,
221,
213,
193,
211,
209,
223,
228,
222,
206,
200,
233,
212,
242,
227,
205,
203,
207,
191,
231,
226,
224,
229,
198,
225,
199,
218,
239,
215,
196,
194,
197,
184,
183,
237,
230,
192,
232,
186,
234,
236,
240,
235,
185,
188,
189,
181,
246,
190,
179,
245,
241
],
"y": [
92,
88,
25,
85,
76,
77,
97,
54,
71,
76,
55,
61,
83,
13,
99,
86,
52,
21,
53,
96,
44,
14,
91,
1,
28,
88,
71,
69,
10,
23,
35,
50,
22,
31,
45,
38,
70,
1,
71,
18,
19,
25,
2,
1,
7,
15,
13,
11,
3,
12,
4,
2,
8,
1,
4,
8,
1,
1,
4,
1,
1,
1
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution of Brightkite Friendship Graph Watts-Strogatz"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
}
},
"yaxis": {
"title": {
"text": "Number of Nodes"
}
}
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"(Filtered) Gowalla Friendship Graph Watts-Strogatz\n",
"Number of nodes: 2324\n",
"Number of edges: 246344\n"
]
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
197,
204,
229,
221,
217,
223,
218,
190,
213,
224,
206,
222,
216,
226,
228,
208,
220,
194,
207,
212,
205,
196,
219,
225,
209,
230,
201,
214,
200,
202,
199,
211,
193,
215,
191,
231,
210,
203,
227,
198,
186,
187,
235,
195,
236,
232,
238,
244,
192,
233,
240,
188,
189,
239,
248,
181,
234,
237,
242,
183,
184,
180,
254
],
"y": [
31,
60,
20,
53,
77,
42,
82,
8,
103,
37,
82,
66,
74,
34,
30,
79,
42,
21,
73,
90,
92,
29,
65,
41,
100,
23,
47,
93,
45,
66,
42,
84,
14,
88,
15,
27,
98,
59,
40,
31,
2,
5,
6,
28,
5,
12,
4,
1,
9,
9,
3,
5,
4,
4,
2,
1,
11,
3,
1,
3,
1,
1,
1
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution of (Filtered) Gowalla Friendship Graph Watts-Strogatz"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
}
},
"yaxis": {
"title": {
"text": "Number of Nodes"
}
}
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Foursquare Friendship Graph Watts-Strogatz\n",
"Number of nodes: 2324\n",
"Number of edges: 246344\n"
]
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"name": "Degree Distribution",
"type": "bar",
"x": [
208,
228,
215,
217,
209,
204,
194,
220,
237,
198,
199,
211,
207,
202,
201,
190,
222,
182,
212,
210,
218,
206,
232,
238,
205,
221,
225,
213,
230,
216,
227,
233,
231,
203,
224,
223,
214,
226,
197,
191,
200,
229,
196,
219,
235,
241,
192,
236,
195,
186,
193,
181,
188,
245,
234,
243,
247,
189,
184,
185,
240,
183,
239,
187
],
"y": [
80,
24,
84,
67,
98,
69,
19,
68,
9,
38,
38,
95,
90,
50,
65,
10,
59,
2,
103,
92,
81,
107,
19,
4,
67,
48,
46,
86,
16,
68,
38,
13,
20,
69,
45,
40,
93,
23,
24,
8,
45,
23,
21,
67,
11,
2,
7,
6,
19,
6,
15,
2,
4,
1,
3,
1,
1,
8,
1,
1,
1,
1,
2,
1
]
}
],
"layout": {
"height": 600,
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "white",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "#C8D4E3",
"linecolor": "#C8D4E3",
"minorgridcolor": "#C8D4E3",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "white",
"showlakes": true,
"showland": true,
"subunitcolor": "#C8D4E3"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "white",
"polar": {
"angularaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
},
"bgcolor": "white",
"radialaxis": {
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"yaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
},
"zaxis": {
"backgroundcolor": "white",
"gridcolor": "#DFE8F3",
"gridwidth": 2,
"linecolor": "#EBF0F8",
"showbackground": true,
"ticks": "",
"zerolinecolor": "#EBF0F8"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"baxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
},
"bgcolor": "white",
"caxis": {
"gridcolor": "#DFE8F3",
"linecolor": "#A2B1C6",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "#EBF0F8",
"linecolor": "#EBF0F8",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "#EBF0F8",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Degree Distribution of Foursquare Friendship Graph Watts-Strogatz"
},
"width": 800,
"xaxis": {
"title": {
"text": "Degree"
}
},
"yaxis": {
"title": {
"text": "Number of Nodes"
}
}
}
}
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for graph in friendships_graph:\n",
"\n",
" p = G.number_of_edges() / (G.number_of_nodes())\n",
" avg_degree = int(np.mean([d for n, d in G.degree()]))\n",
" G = nx.watts_strogatz_graph(G.number_of_nodes(), avg_degree, p)\n",
" G.name = graph.name + \" - Watts-Strogatz similarity\"\n",
"\n",
"\n",
" print(G.name)\n",
" print(\"Number of nodes: \", G.number_of_nodes())\n",
" print(\"Number of edges: \", G.number_of_edges())\n",
" degree_distribution(G, log=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a Poissonian distribution, as expected.\n",
"\n",
"The degree distribution alone is not enough to characterize the network. There are many other quantities, such as the degree-degree correlation (between connected nodes), the spatial correlations, the clustering coefficient, the betweenness or central-ity distribution, and the self-similarity exponents."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Small-World Model\n",
"\n",
"Let's start by clarified that real networks are not random. Their formation and development are dictated by a combination of many different processes and influences. These influencing conditions include natural limitations and processes, human considerations such as optimal performance and robustness, economic considerations, natural selection and many others. Controversies still exist regarding the measure to which random models represent real-world networks. However, in this section we will focus on random network models and attempt to show if their properties may still be used to study properties of our real-world networks. \n",
"\n",
"Many real-world networks have many properties that cannot be explained by the ER model. One such property is the high clustering observed in many real-world networks. This led Watts and Strogatz to develop an alternative model, called the “small-world” model. Quoting their paper:\n",
"\n",
"> \"highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs\"\n",
"\n",
"Their idea was to begin with an ordered lattice, such as the $k$-ring (a ring where each site is connected to its $2k$ nearest neighbors - $k$ from each side) or the two-dimensional lattice. For each site, each of the links emanating from it is removed with probability $\\varphi$ and is rewired to a randomly selected site in the network. In other words, small-world networks have the unique ability to have specialized nodes or regions within a network while simultaneously exhibiting shared or distributed processing across all of the communicating nodes within a network. "
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Small-Worldness"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Given the unique processing or information transfer capabilities of small-world networks, it is vital to determine whether this is a universal property of naturally occurring networks or whether small-world properties are restricted to specialized networks. An overly liberal definition of small-worldness might miss the specific benefits of these networks\n",
"\n",
"> high clustering and low path length\n",
"\n",
"and obscure them with networks more closely associated with regular lattices and random networks. A possible definition of a small-world network is that it has clustering similar to a regular lattice and path length similar to a random network. However, in practice, networks are typically defined as small-world by comparing clustering and path length to those of a comparable random network _(Humphries et al., 2006)_. Unfortunately, this means that networks with very low clustering can be, and indeed are, defined as small-world. We need a method that is able to distinguish true small-world networks from those that are more closely aligned with random or lattice structures and overestimates the occurrence of small-world networks. Networks that are more similar to random or lattice structures are interesting in their own right, but they do not behave like small-world networks\n",
"\n",
"## Identifying small-world networks\n",
"\n",
"Small-world networks are distinguished from other networks by two specific properties, the first being high clustering ($C$) among nodes. High clustering supports specialization as local collections of strongly interconnected nodes readily share information or resources. Conceptually, clustering is quite straightforward to comprehend. In a real-world analogy, clustering represents the probability that ones friends are also friends of each other. Small-world networks also have short path lengths ($L$) as is commonly observed in random networks. The path length is a measure of the distance between nodes in the network, calculated as the mean of the shortest geodesic distances between all possible node pairs. Small values of $L$ ensure that information or resources easily spreads throughout the network. This property makes distributed information processing possible on technological networks and supports the six degrees of separation often reported in social networks.\n",
"\n",
"Watts and Strogatz developed a network model (WS model) that resulted in the first-ever networks with clustering close to that of a lattice and path lengths similar to those of random networks. The WS model demonstrates that random rewiring of a small percentage of the edges in a lattice results in a precipitous decrease in the path length, but only trivial reductions in the clustering. Across this rewiring probability, there is a range where the discrepancy between clustering and path length is very large, and it is in this area that the benefits of small-world networks are realized.\n",
"\n",
"### A first approach: the $\\sigma$ coefficient\n",
"\n",
"In 2006, Humphries and colleagues introduced a quantitative metric, small-world coefficient $\\sigma$, that uses a ratio of network clustering and path length compared to its random network equivalent. In this quantitative approach, $C$ and $L$ are measured against those of their equivalent derived random networks ($C_{rand}$ and $L_{rand}$, respectively) to generate the ratios $c = C/C_{rand}$ and $k = L/L_{rand}$. These ratios are then used to calculate the small-coefficient as:\n",
"$$ \\sigma = \\frac{C/C_{rand}}{L/L_{rand}} = \\frac{\\gamma}{\\sigma} $$\n",
"The conditions that must be met for a network to be classified as small-world are $C \\gg C_{rand}$ and $L \\approx L_{rand}$, which results in $\\sigma > 1$. As it turns out, a major issue with $\\sigma$ is that the clustering coefficient of the equivalent random network greatly influences the small-world coefficient. In the small-world coefficient equation, $\\sigma$ uses the relationship between $C$ and $C_{rand}$ to determine the value of $\\gamma$. Because clustering in a random network is typically extremely low (Humphries and Gurney, 2008; Watts and Strogatz, 1998) the value of $\\gamma$ can be unduly influenced by only small changes in $C_{rand}$. \n",
"\n",
"### A more solid approach: the $\\omega$ coefficient\n",
"\n",
"Given a graph with characteristic path length, $L$, and clustering, $C$, the small-world measurement, $\\omega$, is defined by comparing the clustering of the network to that of an equivalent lattice network, $C_{latt}$, and comparing path length to that of an equivalent random network, $L_rand$; the relationship\n",
"is simply the difference of two ratios defined as:\n",
"$$ \\omega = \\frac{L_{rand}}{L} - \\frac{C}{C_{latt}} $$\n",
"In using the clustering of an equivalent lattice network rather than a random network, this metric is less susceptible to the fluctuations seen with $C_rand$. Moreover, values of $\\gamma$ are restricted to the interval $-1$ to $1$ regardless of network size. Values close to zero are considered small world.\n",
"\n",
"Positive values indicate a graph with more random characteristics, while negative values indicate a graph with more regular, or lattice-like, characteristics.\n",
"\n",
"#### Lattice network construction\n",
"\n",
"In the paper [1] the lattice network was generated by using a modified version of the latticization algorithm (Sporns and Zwi,2004) found in the brain connectivity toolbox (Rubinov and Sporns, 2010). The procedure is based on a Markov-chain algorithm that maintains node degree and swaps edges with uniform probability; however, swaps are carried out only if the resulting matrix has entries that are closer to the main diagonal. To optimize the clustering coefficient of the lattice network, the latticization procedure is performed over several user-defined repetitions. Storing the initial adjacency matrix and its clustering coefficient, the latticization procedure is performed on the matrix. If the clustering coefficient of the resulting matrix is lower, the initial matrix is kept and latticization is performed again on the same matrix; if the clustering coefficient is higher, then the initial adjacency matrix is replaced. This latticization process is repeated until clustering is maximized. This process results in a highly clustered network with long path length approximating a lattice topology. To decrease the processing time in larger networks, a sliding window procedure was developed. Smaller sections of the matrix are sampled along the main diagonal, latticized, and reinserted into the larger matrix in a step-wise fashion.\n",
"\n",
"#### Limitations\n",
"\n",
"The length of time it takes to generate lattice networks, particularly for large networks. Although latticization is fast in smaller networks, large networks such as functional brain networks and the Internet can take several days to generate and optimize. The latticization procedure described here uses an algorithm developed by Sporns and\n",
"Zwi in 2004, but the algorithm was used on much smaller datasets. \n",
"\n",
"Furthermore, $\\omega$ is limited by networks that have very low clustering that cannot be appreciably increased, such as networks with 'super hubs' or hierarchical networks. In hierarchical networks, the nodes are often configured in branches that contain little to no clustering. In networks with super hubs, the network may contain a hub that has a node with a degree that is several times in magnitude greater than the next most connected hub. In both these networks, there are fewer configurations to increase the clustering of the network. Moreover, in a targeted assault of these networks, the topology is easily destroyed (Albert et al., 2000). Such vulnerability to attack signifies a network that may not be small-world."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Omega coefficient computation\n",
"\n",
"The computation of the omega coefficient, a measure of small-worldness in networks, is a time-consuming process. To efficiently compare the clustering coefficient and the shortest path length, we constructed both the lattice reference network and the random reference network multiple times. However, given the limited resources and the utilization of Python, a strategy was employed to reduce computation time.\n",
"\n",
"This strategy consisted of:\n",
"\n",
"1. Generating a random sample of the network\n",
"2. Performing a specified number of rewiring operations per edge to compute the equivalent random graph\n",
"3. Computing the average clustering coefficient and average shortest path length for a specified number of random graphs and then averaging the results\n",
"4. Calculating the omega coefficient for the random sample with formula stated above\n",
" \n",
"Despite the aforementioned sampling technique, the computation of the omega coefficient remained computationally intensive. To mitigate over-sampling and potential bias, the computation was performed on a subset of the network with cardinality $|N|/2$. Additionally, both the number of rewiring operations per edge and the number of random graphs were set to $3$.\n",
"\n",
"Even with these optimizations, the computation of the omega coefficient required several days to complete. The computation was executed on a remote server and the results can be accessed in the form of a pandas dataframe. The results of the computation on the $3$ networks are as follows:"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"In the repository there is a python program `omega_sampled_server.py` that can be used to compute the omega coefficient for a network as described above. You can run it as follows:\n",
"\n",
"```python\n",
"./omega_sampled_server.py graph k niter nrand\n",
"\n",
"# Example:\n",
"./omega_sampled_server.py gowalla 0.5 3 3\n",
"```\n",
"\n",
"Where: \n",
"\n",
"- `graph` is the name of the graph\n",
"- `k` Percentage of nodes to be remove\n",
"- `niter` Number of rewiring operations per edge\n",
"- `nrand` Number of random graphs to be generated\n",
"\n",
"For further details run `./omega_sampled_server.py --help`\n",
"\n",
 **NOTE:** This are slow">
"> **NOTE:** These are slow operations, do not try to run them with higher values of k, niter or nrand. Computing these networks with k=0.5, niter=3 and nrand=3 takes from 3 to 10 days. If you want to test it out, you can use the `gowalla` graph with k=0.1, niter=1 and nrand=1.\n",
"\n",
"The advantage of using an external script rather than a block in the notebook is the ease of parallelization: several scripts can run in parallel on different datasets, and this can easily be automated with a bash script. I won't report the code since it's not relevant to the topic of this project.\n",
"\n",
"In the next section, we will see the results obtained in detail, trying to understand what they mean.\n",
"\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Are our networks small-world?\n",
"\n",
"There are multiple factors to take into consideration. Let's try to recap what we know about the networks we are working with:\n",
"\n",
"- Degree distribution\n",
"- Average clustering coefficient\n",
"- Average shortest path length\n",
"- Betweenness centrality\n",
"- Omega coefficient\n",
"\n",
"## Degree distribution\n",
"\n",
"The degree distribution of a real-world network can characterize the small-world property by showing a balance between the number of highly connected nodes (high degree) and the number of less connected nodes (low degree). A network with a small-world property will have a few highly connected nodes (hubs) and a large number of nodes with a relatively low number of connections. This creates a balance between the number of highly connected nodes and the number of less connected nodes, which allows for efficient information flow and rapid spreading of information throughout the network. Additionally, the degree distribution of a small-world network will typically follow a power-law distribution, with a few highly connected nodes and a large number of less connected nodes, further emphasizing the small-world property.\n",
"\n",
"As we have seen from the sections before, the distribution presented is far form Poissonian, and very close to a power law. However, the degree distribution alone is not enough to state that a real-world network is a small-world network because it does not take into account the specific relationships and interactions between the nodes in the network. A random network can also have a similar degree distribution, but the relationships between the nodes would be different from those in a small-world network.\n",
"\n",
"For example, a random network could be generated by randomly connecting nodes together without considering any specific relationships between them. In this case, the degree distribution may be similar to that of a social network, but the relationships between the nodes would be different.\n",
"\n",
"Additionally, to recreate this degree distribution with a random network, you can use the Barabasi-Albert model. This model generates a random network with a power-law degree distribution, which is similar to the degree distribution found in many real-world networks, including small-world networks. This model simulates the growth process of a network, where new nodes are added to the network and they preferentially connect to the existing nodes that have a high degree, this leads to a power-law degree distribution which is similar to the degree distribution of many small-world networks.\n",
"\n",
"## Betweenness centrality\n",
"\n",
"The betweenness centrality of a node in a network measures the number of times that node acts as a bridge or intermediary between other nodes in the network. In a small-world network, nodes have a high betweenness centrality because they often act as intermediaries between distant nodes, allowing for short paths and efficient communication between distant parts of the network. Therefore, a high degree of betweenness centrality in a network can be used to characterize its small-world propriety.\n",
"\n",
"To determine if the average betweenness centrality of a network is high or not we can compare it with the theoretical values of random networks. As the betweenness centrality is a measure of how much a node is used as a bridge between other nodes, random networks tend to have a low value of betweenness centrality. If the average betweenness centrality of our network is higher than the theoretical values of a random network, it can be considered a high value and therefore the network is more likely to be a small-world network.\n",
"\n",
"Let's test it out with our networks:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Creating a random graph with the Erdos-Renyi model Brightkite Checkins Graph\n",
"Number of edges in the original graph: 292973\n",
"Number of edges in the random graph: 293235\n",
"Random graph created for Brightkite Checkins Graph Starting computation of betweenness centrality...\n",
"\tNumber of nodes after removing 40.0% of nodes: 3896\n",
"\tNumber of edges after removing 40.0% of nodes: 105312\n",
"\tBetweenness centrality for Erdos-Renyi random graph: 0.0003728834232472551\n",
"\n",
"Creating a random graph with the Erdos-Renyi model Gowalla Checkins Graph\n",
"Number of edges in the original graph: 62790\n",
"Number of edges in the random graph: 62918\n",
"Random graph created for Gowalla Checkins Graph Starting computation of betweenness centrality...\n",
"\tNumber of nodes after removing 40.0% of nodes: 1844\n",
"\tNumber of edges after removing 40.0% of nodes: 22740\n",
"\tBetweenness centrality for Erdos-Renyi random graph: 0.0009215261155179815\n",
"\n",
"Creating a random graph with the Erdos-Renyi model Foursquare Checkins Graph\n",
"Number of edges in the original graph: 246702\n",
"Number of edges in the random graph: 247301\n",
"Random graph created for Foursquare Checkins Graph Starting computation of betweenness centrality...\n",
"\tNumber of nodes after removing 40.0% of nodes: 1395\n",
"\tNumber of edges after removing 40.0% of nodes: 88929\n",
"\tBetweenness centrality for Erdos-Renyi random graph: 0.0006522226121634739\n",
"\n",
"Creating a random graph with the Erdos-Renyi model Brightkite Friendship Graph\n",
"Number of edges in the original graph: 14690\n",
"Number of edges in the random graph: 14749\n",
"Random graph created for Brightkite Friendship Graph Starting computation of betweenness centrality...\n",
"\tNumber of nodes after removing 40.0% of nodes: 3252\n",
"\tNumber of edges after removing 40.0% of nodes: 5346\n",
"\tBetweenness centrality for Erdos-Renyi random graph: 0.0016407812858385549\n",
"\n",
"Creating a random graph with the Erdos-Renyi model Gowalla Friendship Graph\n",
"Number of edges in the original graph: 5548\n",
"Number of edges in the random graph: 5419\n",
"Random graph created for Gowalla Friendship Graph Starting computation of betweenness centrality...\n",
"\tNumber of nodes after removing 40.0% of nodes: 1377\n",
"\tNumber of edges after removing 40.0% of nodes: 1952\n",
"\tBetweenness centrality for Erdos-Renyi random graph: 0.0037251547240147328\n",
"\n",
"Creating a random graph with the Erdos-Renyi model Foursquare Friendship Graph\n",
"Number of edges in the original graph: 5323\n",
"Number of edges in the random graph: 5214\n",
"Random graph created for Foursquare Friendship Graph Starting computation of betweenness centrality...\n",
"\tNumber of nodes after removing 40.0% of nodes: 839\n",
"\tNumber of edges after removing 40.0% of nodes: 1828\n",
"\tBetweenness centrality for Erdos-Renyi random graph: 0.0042446600624415146\n",
"\n"
]
}
],
"source": [
"# As said before, for a quick testing I suggest to use k=0.6 and at least k=0.4 for accurate results\n",
"\n",
"# uncomment the model that you want to use for the random graphs\n",
"\n",
"# model_name = 'watts_strogatz'\n",
"model_name = 'erdos_renyi'\n",
"\n",
"random_graphs = {}\n",
"for graph in graphs_all:\n",
" G = create_random_graphs(graph, model=model_name, save = False)\n",
" print(\"Random graph created for \", graph.name, \"Starting computation of betweenness centrality...\")\n",
" betweenness_centrality = np.mean(list(betweenness_centrality_parallel(G, 6, k = 0.4).values()))\n",
" print(\"\\tBetweenness centrality for Erdos-Renyi random graph: \", betweenness_centrality)\n",
" random_graphs[graph.name] = betweenness_centrality\n",
" print(\"\")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'Brightkite Checkins Graph': 0.0003728834232472551,\n",
" 'Gowalla Checkins Graph': 0.0009215261155179815,\n",
" 'Foursquare Checkins Graph': 0.0006522226121634739,\n",
" 'Brightkite Friendship Graph': 0.0016407812858385549,\n",
" 'Gowalla Friendship Graph': 0.0037251547240147328,\n",
" 'Foursquare Friendship Graph': 0.0042446600624415146}"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"random_graphs"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABcIAAAPdCAYAAACp3hugAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAC/xUlEQVR4nOzdebRVdf0//udluoAMCsqkTE4ozoIilAIOCA45lZSKouZsikP5dcoh09TSMgeyHLIU8ZNj5gAOkCkqOKCpOYVCCQ4o4AQI7N8fLs7P6wW8Vy5ePT0ea521Ou/92u/92vuefaqn2/epKIqiCAAAAAAAlKkG9d0AAAAAAACsSIJwAAAAAADKmiAcAAAAAICyJggHAAAAAKCsCcIBAAAAAChrgnAAAAAAAMqaIBwAAAAAgLImCAcAAAAAoKwJwgEAAAAAKGuCcAD4Brn22mtTUVFR5bXaaqtlwIABufPOO7/0vJdffnmuvfbaumuUWpkzZ05+/vOfp3fv3mnVqlUqKyvTrVu3HHTQQXnyySdX6LFvuOGG/PrXv15h83fr1i3Dhw8vvX/ttddSUVFR5fP2yCOP5Mwzz8ysWbNWWB9f5Le//W3WXnvtNGnSJBUVFUvtZVm9duvWLbvsssuKbbSODBgwIAMGDPhS+5555pmpqKio24aWoKKiImeeeeYKP86K8k35PCz+75XXXnttmXV33XXXUv8eFRUVOfroo+u+ua+hml4v/rc+FwB8MwjCAeAb6JprrsmECRPyyCOP5Morr0zDhg2z66675q9//euXmk8QXn9effXVbLbZZvnFL36RgQMHZtSoURkzZkzOOuusvPnmm+nVq1dmz569wo6/ooPwz+vYsWMmTJiQnXfeuTT2yCOP5Kyzzqq3IPzpp5/OMccck4EDB+aBBx7IhAkT0rJlyyXW1nevdeXyyy/P5Zdf/qX2/eEPf5gJEybUcUd83d11110566yz6rsNAIAvrVF9NwAA1N6GG26Y3r17l94PHjw4q6yySkaNGpVdd921HjujNhYuXJg99tgj77zzTiZMmJANN9ywtK1///454IADcvfdd6dx48b12OX/b+HChVmwYEEqKyu/9ByVlZXZaqut6rCr5ffcc88lSQ455JBsueWW9dzNivXRRx+lefPm6dmz55eeY4011sgaa6xRh13Vn8XXg2+moigyd+7cNGvWrL5b+cZyDwDwv8QT4QBQBpo2bZomTZpUC0znz5+fc845J+utt14qKyuz2mqr5cADD8zbb79dqunWrVuee+65jB8/vrTcSrdu3VIURdq3b5+jjjqqVLtw4cKsssoqadCgQd58883S+EUXXZRGjRpVeUp20qRJ+c53vpM2bdqkadOm2WyzzXLTTTdV633GjBk57LDDssYaa6RJkybp3r17zjrrrCxYsKBUs3g5jV/+8pe56KKL0r1797Ro0SJ9+/bNo48+WmW+4cOHp0WLFnnllVey0047pUWLFuncuXNOOOGEzJs3r9bXJ0keeOCBDBgwIG3btk2zZs3SpUuX7LXXXvnoo49KNVdccUU22WSTtGjRIi1btsx6662XU045ZVl/ttx222159tlnc/LJJ1cJwT9ryJAhVUKKl19+Ofvss0/atWuXysrKrL/++rnsssuq7DNu3LhUVFRk1KhROfXUU9OpU6e0atUq22+/fV588cVS3YABA/K3v/0tr7/+epXldj57zS+44IKcc8456d69eyorK/Pggw9m7ty5OeGEE7LpppumdevWadOmTfr27Zvbb799mef72XkX/xsIZ555Zn784x8nSbp3717qYdy4cTn44IPTpk2bKtd5sW233TYbbLDBFx7v6quvziabbJKmTZumTZs22WOPPfLCCy9UuQb77bdfkqRPnz6pqKiospTLZy2r18+65557svnmm6dZs2ZZb731cvXVV1ebqyaf+6VZtGhRLrjggtLntl27dtl///3zn//8p0rdgAEDsu
GGG+bvf/97+vXrl+bNm+eggw4qbfv80ij/+c9/8t3vfjctW7bMyiuvnH333TcTJ06stpTNkpZGWbwMyBed+9tvv50jjzwyPXv2TIsWLdKuXbtsu+22eeihh77wvJempn0v/m549tlnM2jQoLRs2TLbbbddkmTs2LHZbbfdssYaa6Rp06ZZe+21c9hhh+Wdd96pcqzF5/7UU09lzz33TKtWrdK6devst99+1b43FqvJ52FJzjrrrPTp0ydt2rRJq1atsvnmm+eqq65KURRV6mp67ZPk0Ucfzbe+9a00bdo0nTp1ysknn5xPPvnkC3sZPnx46Xvms98Vn18e5E9/+lPWX3/9NG/ePJtssskSl+yqyXfY0ixeamPkyJFZf/31U1lZmT/+8Y/1fr1qe09OmDAh/fr1S7NmzdKtW7dcc801SZK//e1v2XzzzdO8efNstNFGueeee2p0XZ577rkMGjQozZs3z2qrrZajjjoqf/vb36p9Py3rO2H06NEZNGhQOnbsmGbNmmX99dfP//t//y8ffvhhlWMtvo+ee+65bLfddllppZWy2mqr5eijj17id3VSs88FAHwlCgDgG+Oaa64pkhSPPvpo8cknnxTz588vpk2bVhxzzDFFgwYNinvuuadUu3DhwmLw4MHFSiutVJx11lnF2LFjiz/84Q/F6quvXvTs2bP46KOPiqIoiieffLJYc801i80226yYMGFCMWHChOLJJ58siqIovv/97xfrrrtuac5HH320SFI0a9asuP7660vjQ4YMKbbccsvS+wceeKBo0qRJsfXWWxejR48u7rnnnmL48OFFkuKaa64p1U2fPr3o3Llz0bVr1+J3v/tdcd999xU/+9nPisrKymL48OGluilTphRJim7duhWDBw8ubrvttuK2224rNtpoo2KVVVYpZs2aVao94IADiiZNmhTrr79+8ctf/rK47777ip/+9KdFRUVFcdZZZ9X6+kyZMqVo2rRpscMOOxS33XZbMW7cuOL6668vhg0bVrz33ntFURTFqFGjiiTFj370o2LMmDHFfffdV4wcObI45phjlvn3PPTQQ4skxQsvvPCFf/uiKIrnnnuuaN26dbHRRhsV1113XTFmzJjihBNOKBo0aFCceeaZpboHH3ywdL323Xff4m9/+1sxatSookuXLsU666xTLFiwoDTft771raJDhw6lv/2ECROqXPPVV1+9GDhwYPGXv/ylGDNmTDFlypRi1qxZxfDhw4s//elPxQMPPFDcc889xYknnlg0aNCg+OMf/1il565duxYHHHBAtb/l4s/BtGnTih/96EdFkuKWW24p9TB79uxi8uTJRZLi97//fbXrkKS47LLLlnm9zj333CJJ8YMf/KD429/+Vlx33XXFmmuuWbRu3bp46aWXSnOddtpppZ4mTJhQvPLKK0ucb1m9Lj7XNdZYo+jZs2dx3XXXFffee2/xve99r0hSjB8/vjRPTT/3S7P4c3P00UcX99xzTzFy5MhitdVWKzp37ly8/fbbpbr+/fsXbdq0KTp37lz89re/LR588MFSH/379y/69+9fqv3ggw+Ktddeu2jTpk1x2WWXFffee29x3HHHFd27d692355xxhnF5/9vRE3P/V//+ldxxBFHFDfeeGMxbty44s477ywOPvjgokGDBsWDDz5YZc4kxRlnnLHMa1Gbvg844ICicePGRbdu3YrzzjuvuP/++4t77723KIqiuOKKK4rzzjuvuOOOO4rx48cXf/zjH4tNNtmk6NGjRzF//vxq5961a9fixz/+cXHvvfcWF110UbHSSisVm222WZXaml6TpRk+fHhx1VVXFWPHji3Gjh1b/OxnPyuaNWtW5XusNsd57rnniubNmxc9e/YsRo0aVdx+++3FjjvuWHTp0qVIUkyZMmWpvbzyyivFd7/73SJJle+KuXPnFkVRlL5vttxyy+Kmm24q7rrrrmLAgAFFo0aNildffbVKDzX5Dluaxd9JG2+8cX
HDDTcUDzzwQPHPf/6z3q9Xbe7Jtm3bFj169Ciuuuqq4t577y122WWXIklx1llnFRtttFExatSo4q677iq22mqrorK
"text/plain": [
"<Figure size 1500x1000 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"fig, ax = plt.subplots(figsize=(15, 10))\n",
"index = np.arange(len(random_graphs))\n",
"bar_width = 0.35\n",
"opacity = 0.8\n",
"\n",
"rects1 = plt.bar(index, analysis_results['betweenness centrality'], bar_width,\n",
"alpha=opacity,\n",
"color='b',\n",
"label='Original Graph')\n",
"\n",
"rects2 = plt.bar(index + bar_width, random_graphs.values(), bar_width,\n",
"alpha=opacity,\n",
"color='g',\n",
"label='Random Graph')\n",
"\n",
"plt.xlabel('Graph')\n",
"plt.ylabel('Betweenness Centrality')\n",
"plt.title('Betweenness Centrality of the original graph and the random graph')\n",
"plt.xticks(index + bar_width, random_graphs.keys())\n",
"plt.legend()\n",
"\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As we can see, there is a clear difference between the betweenness centrality of the networks generated from the checkins and the networks generated from the friendships. Since the values of the betweenness centrality of the networks generated from the checkins are higher than the theoretical values of a random network, we can conclude that the networks generated from the checkins are more likely to be a small-world network. On the other hand, the networks generated from the friendships have a lower value of betweenness centrality than the theoretical values of a random network, therefore we can conclude that the networks generated from the friendships are less likely to be a small-world network.\n",
"\n",
"This propriety appears both with the erdos-renyi and the watts-strogatz models."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clustering coefficient\n",
"\n",
"The simplest way to treat clustering analytically in a small-world network is to use the link addition, rather than the rewiring model. In the limit of large network size, $N \\to \\infty$, and for a fixed fraction of shortcuts $\\phi$, it is clear that the probability of forming triangle vanishes as we approach $1/N$, so the contribution of the shortcuts to the clustering is negligible. Therefore, the clustering of a small-world network is determined by its underlying ordered lattice. For example, consider a ring where each node is connected to its $k$ closest neighbors from each side. A node's number of neighbors is therefore $2k$, and thus it has $2k(2k - 1)/2 = k(2k - 1)$ pairs of neighbors. Consider a node, $i$. All of the $k$ nearest nodes on $i$'s left are connected to each other, and the same is true for the nodes on $i$'s right. This amounts to $2k(k - 1)/2 = k(k - 1)$ pairs. Now consider a node located $d$ places to the left of $k$. It is also connected to its $k$ nearest neighbors from each side. Therefore, it will be connected to $k - d$ neighbors on $i$'s right side. The total number of connected neighbor pairs is\n",
"\n",
"\\begin{equation}\n",
" k(k-1) + \\sum_{d=1}^k (k-d) = k(k-1) + \\frac{k(k-1)}{2} = \\frac{3}{2} k (k-1)\n",
"\\end{equation}\n",
"\n",
"and the clustering coefficient is:\n",
"\n",
"\\begin{equation}\n",
" C = \\frac{\\frac{3}{2}k(k-1)}{k(2k-1)} =\\frac{3 (k-1)}{2(2k-1)}\n",
"\\end{equation}\n",
"\n",
"For every $k > 1$, this results in a constant larger than $0$, indicating that the clustering of a small-world network does not vanish for large networks. For large values of $k$, the clustering coefficient approaches $3/4$, that is, the clustering is very high. Note that for a regular two-dimensional grid, the clustering by definition is zero, since no triangles exist. However, it is clear that the grid has a neighborhood structure.\n",
"\n",
"\n",
"--- \n",
"\n",
"We can compare the results of the clustering coefficient that we obtained with the standard formula, and the one that we obtained with the formula above. We can do that with the function `generalized_average_clustering_coefficient` in the `utils.py` file. The function takes as input a networkx graph object and returns a float: the average clustering coefficient of the graph."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'Brightkite Checkins Graph': 0.6426519071903248,\n",
" 'Gowalla Checkins Graph': 0.6159366386543966,\n",
" 'Foursquare Checkins Graph': 0.6949399573838294,\n",
" 'Brightkite Friendship Graph': 0.4044470961191924,\n",
" 'Gowalla Friendship Graph': 0.4228365321024048,\n",
" 'Foursquare Friendship Graph': 0.4585372995852263}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"generalized_cc = {}\n",
"for graph in graphs_all:\n",
" generalized_cc[graph.name] = generalized_average_clustering_coefficient(graph)\n",
"\n",
"generalized_cc"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABcIAAAPdCAYAAACp3hugAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAACzZElEQVR4nOzdebxV8/4/8PduOEOqQ6M0UzIkQ91ShmQI4crlitySEDLcJtPXmClzyKXrGsJ1zXRJSqZEcYvMYxlCpRSlqHTO+v3h0fk5TuXs2sfR9nw+Hudx7/6sz1rrvdZee+28zud8VipJkiQAAAAAACBLVaroAgAAAAAAoDwJwgEAAAAAyGqCcAAAAAAAspogHAAAAACArCYIBwAAAAAgqwnCAQAAAADIaoJwAAAAAACymiAcAAAAAICsJggHAAAAACCrCcIByDo33HBDpFKpaN26dUWX8rtUVFQUd999d+y9995Rp06dqFq1atSrVy8OPPDAePzxx6OoqCgiIj799NNIpVIxatSocqlj9uzZceGFF8brr79eLtsfNWpUpFKp+PTTT8tl+2Xx5ptvxjHHHBPNmzePvLy8qF69euy0005x5ZVXxsKFC8t13/fff39su+22kZ+fH6lUqvg8jxgxIlq0aBE5OTmRSqXi22+/jT59+kSzZs3S3scee+wRe+yxR0br/qV33303LrzwwnJ5H5955plo165dbLTRRpFKpWL06NGr7be2a7VPnz5RvXr1jNdWHi688MJIpVLrtO7zzz8fqVQqnn/++cwW9Qu/xTVVnjaU66Gs9/e1ff722GMP37MZlkql4sILLyx+/Vt97n5pfe4Vvwer6v/6668ruhQAfmcE4QBkndtvvz0iIt5555145ZVXKria35dly5ZFt27d4uijj4569erFzTffHM8++2yMHDkyNttss/jrX/8ajz/++G9Sy+zZs2Po0KHlFoQfcMABMWXKlGjQoEG5bP/X/Otf/4q2bdvG1KlT4/TTT49x48bFo48+Gn/9619j5MiRceyxx5bbvufPnx+9evWKLbbYIsaNGxdTpkyJLbfcMl5//fU47bTTokuXLvHss8/GlClTokaNGnHeeefFo48+mvZ+brrpprjpppvK4Qj+v3fffTeGDh2a8SA8SZI4/PDDo2rVqvHYY4/FlClTonPnzqvtW97X6m/luOOOiylTpqzTujvttFNMmTIldtpppwxXxe9ZeX3+KBufOwDIrCoVXQAAZNK0adPijTfeiAMOOCCeeOKJuO2226JDhw6/aQ1JksSyZcsiPz//N91vWQwaNCjGjx8fd955Z/Tu3bvEsr/85S9x+umnxw8//FBB1WXGDz/8EHl5eVG3bt2oW7duhdQwZcqUOOmkk2KfffaJ0aNHR25ubvGyffbZJwYPHhzjxo0rt/1/+OGH8eOPP8bf/va3EuHuO++8ExERxx9/fLRv3764fYsttlin/WyzzTbrV2gFmj17dixcuDAOOeSQ2GuvvSq6nHL1/fffR7Vq1aJRo0bRqFGjddpGzZo1Y+edd85wZRVj1T1iQx7xyu/fjz/+GKlUKqpUWff/5M6mz90vZeL8AEC6jAgHIKvcdtttERFx+eWXR6dOneK+++6L77//PiJ++o+uevXqRa9evUqt9+2330Z+fn4MGjSouG3x4sUxZMiQaN68eeTk5ETDhg1jwIABsXTp0hLrplKpOOWUU2LkyJGx9dZbR25ubtx5550RETF06NDo0KFD1KpVK2rWrBk77bRT3HbbbZEkSYltLF++PAYPHhybbrppVKtWLXbfffd49dVXo1mzZtGnT58SfefOnRsnnHBCNGrUKHJycqJ58+YxdOjQWLly5VrPzdy5c+PWW2+Nfffdt1QIvkrLli2jTZs2a9zGmqbQWN2fUT/44IPRoUOHKCgoiGrVqsXmm28effv2jYif/tz7T3/6U0REHHPMMZFKpUr9Sfi0adPiz3/+c9SqVSvy8v
Jixx13jAceeKDEPlZNf/LUU09F3759o27dulGtWrVYvnz5aqdGWfWn/FOnTo3ddtutuK7LL7+8eEqYVd55553o2rVrVKtWLerWrRsnn3xyPPHEE2X6M/XLLrssUqlU3HLLLSVC8FVycnLiz3/+c/HroqKiuPLKK2OrrbaK3NzcqFevXvTu3Tu++OKLUus+/fTTsddee0XNmjWjWrVqscsuu8QzzzxTvLxPnz6x6667RkREjx49IpVKFU838be//S0iIjp06BCpVKr42lrd+1pUVBQjRoyIHXbYIfLz82PjjTeOnXfeOR577LES5/OX01isWLEiLrnkkuJjqVu3bhxzzDExf/78Ev2aNWsWBx54YIwbNy522mmnyM/Pj6222qr4Lzoifnp///rXv0ZERJcuXYqvk1+bzuHFF1+MvfbaK2rUqBHVqlWLTp06xRNPPFG8/MILLywOhM8888xIpVJrnBqmLNdqRMSMGTOiW7duUb169WjcuHEMHjw4li9fvk7nZk0ee+yx6NixY1SrVi1q1KgR++yzT6kR3qs+i6+99locdthhsckmmxT/omN1n9Oy3ntWN0XDqmlAynLsZb0XllVZ617bPWLGjBlxzDHHRMuWLaNatWrRsGHDOOigg+Ktt94qsa9Vx/7vf/87Bg0aFJtuumnk5+dH586dY/r06autryznZHXuv//+6Nq1azRo0CDy8/Nj6623jrPOOqvU904653727Nlx+OGHR40aNaKgoCB69OgRc+fO/dVayvr5K8v9tKzfp6uTJElcdtll0bRp08jLy4t27drFhAkTVnv/Sfd7++67746tt946qlWrFttvv32MGTOm1P4/+uij6NmzZ9SrVy9yc3Nj6623jn/84x8l+qy6Ru6+++4YPHhwNGzYMHJzc2PGjBkxf/786N+/f2yzzTZRvXr1qFevXuy5554xadKkXz32X37uVk1ps6afn/u174pVnnjiidhhhx0iNzc3mjdvHldfffWv1rVKWd+bTJyfVcd+5ZVXxqWXXhpNmjQp3ufqjisi4quvvoojjzwyCgoKon79+tG3b99YtGhRmY8PgCyUAECW+P7775OCgoLkT3/6U5IkSXLrrbcmEZGMGjWquM/AgQOT/Pz8ZNGiRSXWvemmm5KISN58880kSZJk6dKlyQ477JDUqVMnufbaa5Onn346uf7665OCgoJkzz33TIqKiorXjYikYcOGSZs2bZL//Oc/ybPPPpu8/fbbSZIkSZ8+fZLbbrstmTBhQjJhwoTk4osvTvLz85OhQ4eW2P+RRx6ZVKpUKTnrrLOSp556KrnuuuuSxo0bJwUFBcnRRx9d3G/OnDlJ48aNk6ZNmyb//Oc/k6effjq5+OKLk9zc3KRPnz5rPT//+c9/kohIbr755jKdz08++SSJiOSOO+4objv66KOTpk2blup7wQUXJD//Z8XkyZOTVCqVHHHEEcnYsWOTZ599NrnjjjuSXr16JUmSJIsWLUruuOOOJCKSc889N5kyZUoyZcqU5PPPP0+SJEmeffbZJCcnJ9ltt92S+++/Pxk3blzSp0+fUvWs2kbDhg2Tfv36JU8++WTy0EMPJStXrixe9sknnxT379y5c1K7du2kZcuWyciRI5MJEyYk/fv3TyIiufPOO4v7zZ49O6ldu3bSpEmTZNSoUcnYsWOTXr16Jc2aNUsiInnuuefWeN5WrlyZVKtWLenQoUOZznOSJEm/fv2SiEhOOeWUZNy4ccnIkSOTunXrJo0bN07mz59f3O/uu+9OUqlU0r179+SRRx5JHn/88eTAAw9MKleunDz99NNJkiTJjBkzkn/84x9JRCSXXXZZMmXKlOSdd95J3nnnneTcc88tPodTpkxJZsyYscb3tVevXkkqlUqOO+645L///W/y5JNPJpdeemly/fXXlzifnTt3Ln5dWFiY7LfffslGG22UDB06NJkwYUJy6623Jg0bNky22Wab5Pvvvy/u27Rp06RRo0bJNt
tsk9x1113J+PHjk7/+9a9JRCQTJ05MkiRJ5s2bl1x22WVJRCT/+Mc/iq+TefPmrfFcPv/880nVqlWTtm3bJvfff38
"text/plain": [
"<Figure size 1500x1000 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# now we can compare the results of the generalized average clustering coefficient with the original average clustering coefficient. Use matplotlib to plot the results as an histogram with two bars for each graph\n",
"\n",
"import matplotlib.pyplot as plt\n",
"\n",
"fig, ax = plt.subplots(figsize=(15, 10))\n",
"index = np.arange(len(generalized_cc))\n",
"bar_width = 0.35\n",
"opacity = 0.8\n",
"\n",
"rects1 = plt.bar(index, analysis_results['Average Clustering Coefficient'], bar_width,\n",
"alpha=opacity,\n",
"color='b',\n",
"label='Original Graph')\n",
"\n",
"rects2 = plt.bar(index + bar_width, generalized_cc.values(), bar_width,\n",
"alpha=opacity,\n",
"color='g',\n",
"label='Generalized Graph')\n",
"\n",
"plt.xlabel('Graph')\n",
"plt.ylabel('Average Clustering Coefficient')\n",
"plt.title('Average Clustering Coefficient of the original graph and the generalized graph')\n",
"plt.xticks(index + bar_width, generalized_cc.keys())\n",
"plt.legend()\n",
"\n",
"plt.tight_layout()\n",
"plt.show()\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As we can see, for the graphs generated from the checkins, the two values are very similar. However, for the graphs generated from the friendships, the values are very different. This is another suggestion that the checkins graphs are more likely to be a small-world network than the friendships graphs. \n",
"\n",
"But this is not enough to jump to conclusions"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Omega coefficient\n",
"\n",
"We have already discussed a lot in the previous sections about this measure, let's see the results that we obtained after days of computations on the server:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Graph</th>\n",
" <th>omega-coefficient</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Brightkite Checkins Graph</td>\n",
" <td>-0.180</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Gowalla Checkins Graph</td>\n",
" <td>-0.240</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Foursquare Checkins Graph</td>\n",
" <td>-0.056</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Brightkite Friendship Graph</td>\n",
" <td>-0.200</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Gowalla Friendship Graph</td>\n",
" <td>-0.250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Foursquare Friendship Graph</td>\n",
" <td>-0.170</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Graph omega-coefficient\n",
"0 Brightkite Checkins Graph -0.180\n",
"1 Gowalla Checkins Graph -0.240\n",
"2 Foursquare Checkins Graph -0.056\n",
"3 Brightkite Friendship Graph -0.200\n",
"4 Gowalla Friendship Graph -0.250\n",
"5 Foursquare Friendship Graph -0.170"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"analysis_results[['Graph', 'omega-coefficient']]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This results are a bit of a surprise. The small-world coefficient (omega) measures how much a network is like a lattice or a random graph. Negative values mean G is similar to a lattice whereas positive values mean G is a random graph. Values close to 0 mean that G has small-world characteristics.\n",
"\n",
"Based only on this metric, we may conclude that all the networks are small-worlds. In fact, all the values of the omega coefficient are $~0.2$ (with the exception of the foursquare checkins graph, whose value is very close to $0$). However, I don't this this is the case. \n",
"\n",
"# Conclusion\n",
"\n",
"We have seen in the previous section that the $\\omega$ coefficient can be tricked by networks that have a very low clustering coefficient, and in my opinion this is excatly what is happening here. The networks generated from the friendships have a very low clustering coefficient, and therefore they are biasing the $\\omega$ coefficient. This conclusion is supported by the fact the measures like the betweenness centrality and the clustering coefficient that we have shown before, suggest that the networks generated from the friendships are not small-world networks. \n",
"\n",
"Furthermore, on a more euristic level, those graphs represent a social network with data taken in 2010, a time when social networks were not as popular as they are today. Therefore, I would not be surprised if those networks were not small-world networks. \n",
"\n",
"This study evidences why the charaterization of the small-world propriety of a real-world network is still subject of debate. Even if we have used the most reliable techniques that the literature has to offer, we still have not been able to reach a definitive conclusion."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.10.8 64-bit",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "e7370f93d1d0cde622a1f8e1c04877d8463912d04d973331ad4851f04de6915a"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}