% The PageRank model was proposed by Google in a series of papers to evaluate accurately the most important web-pages from the World Wide Web matching a set of keywords entered by a user. For search engine rankings, the importance of web-pages is computed from the stationary probability vector of the random process of a web surfer who keeps visiting a large set of web-pages connected by hyperlinks. The link structure of the World Wide Web is represented by a directed graph, the so-called web link graph, and its corresponding adjacency matrix $G \in \N^{n \times n}$ where $n$ denotes the number of pages and $G_{ij}$ is nonzero (being 1) only if the \emph{jth} page has a hyperlink pointing to the \emph{ith} page. The transition probability matrix $P \in \R^{n \times n}$ of the random process has entries as described in \ref{eq:transition}.
% write the paragraph above, with different words
The PageRank algorithm is a method developed by Google to determine the relevance of web pages to specific keywords. It is used to rank search results based on the importance of the pages, as determined by the probability that a web surfer will visit them. The algorithm works by representing the links between web pages as a directed graph, with each page represented by a vertex and each link represented by an edge. The importance of a page is then determined by the number of links pointing to it and the importance of the pages that link to it. The PageRank algorithm is based on the idea that a page is more likely to be important if it is linked to by other important pages, and it is represented mathematically by a transition probability matrix $P \in\R^{n \times n}$, which can be calculated using the formula in equation \ref{eq:transition}, where we consider its adjacency matrix $G \in\N^{n \times n}$, where $n$ is the number of pages and $G_{ij}$ is nonzero (being 1) only if the \emph{j-th} page has a hyperlink pointing to the \emph{i-th} page.
\noindent The entire random process needs a unique stationary distribution. To ensure this propriety is satisfied, the transition matrix $P$ is usually modified to be an irreducible stochastic matrix $A$ (called the Google matrix) as follows:
% \noindent To ensure that the random process has a unique stationary distribution and it will not stagnate, the transition matrix P is usually modified to be an irreducible stochastic matrix $A$ (called the Google matrix) as follows
\begin{equation}\label{eq:google}
A = \alpha\tilde P + (1 - \alpha)v e^T
\end{equation}
\noindent In \ref{eq:google} we have defined a new matrix called $\tilde P = P + vd^T$ where $d \in N^{n \times1}$ is a binary vector tracing the indices of the damping web pages with no hyperlinks, i.e., $d(i)=1$ if the \emph{i-th} page has no hyperlink, $v \in\R^{n \times n}$ is a probability vector, $e =[1, 1, ... ,1]^T$ and $0<\alpha<1$, the so-called damping factor that represents the probability in the model that the surfer transfer by clicking a hyperlink rather than other ways. Mathematically, the PageRank model can be formulated as the problem of finding the positive unit eigenvector $x$ (the so-called PageRank vector) such that
% \noindent The authors of the paper \cite{SHEN2022126799} emphasize how in the in the past decade or so, considerable research attention has been devoted to the efficient solution of problems \ref{eq:pr} \ref{eq:pr2}, especially when $n$ is very large. For moderate values of the damping factor, e.g. for $\alpha = 0.85$ as initially suggested by Google for search engine rankings, solution strategies based on the simple Power method have proved to be very effective. However, when $\alpha$ approaches 1, as is required in some applications, the convergence rates of classical stationary iterative methods including the Power method tend to deteriorate sharply, and more robust algorithms need to be used. \vspace*{0.4cm}
\noindent In recent years, there has been a lot of interest in finding efficient ways to solve problems \ref{eq:pr} and \ref{eq:pr2}, especially when the number of variables (denoted by $n$) is very large. For moderate values of the damping factor (e.g., $\alpha=0.85$, as suggested by Google for search engine rankings), the Power method has proven to be a reliable solution. However, when the damping factor gets closer to 1, which is necessary in some cases, traditional iterative methods like the Power method may not work as well and more robust algorithms may be required. This point is emphasized in the paper \cite{SHEN2022126799}. \vspace*{0.4cm}
\noindent In the reference paper that we are using for this project, the authors focus their attention in the area of PageRank computations with the same network structure but multiple damping factors. For example, in the Random Alpha PageRank model used in the design of anti-spam mechanism \cite{Constantine2009Random}, the rankings corresponding to many different damping factors close to 1 need to be computed simultaneously. They explain that the problem can be expressed mathematically as solving a sequence of linear systems
\begin{equation}\label{eq:pr3}
(I - \alpha_i \tilde P)x_i = (1 - \alpha_i)v \quad\alpha_i \in (0, 1) \quad\forall i \in\{1, 2, ..., s\}
% As we know, standard PageRank algorithms applied to \ref{eq:pr3} would solve the $s$ linear systems independently. Although these solutions can be performed in parallel, the process would still demand large computational resources for high dimension problems.
% This consideration motived the authors to search novel methods with reduced algorithmic and memory complexity, to afford the solution of larger problems on moderate computing resources. They suggest to write the PageRank problem with multiple damping factors given at once \ref{eq:pr3} as a sequence of shifted linear systems of the form:
Traditionally, PageRank algorithms applied to problem \ref{eq:pr3} would involve solving multiple linear systems independently. While this process can be parallelized to some extent, it can still be computationally intensive for high-dimensional problems. In an effort to find more efficient methods with lower algorithmic and memory complexity, the authors of the paper searched for alternative approaches that would allow them to solve larger problems with moderate computing resources. They proposed expressing the PageRank problem with multiple damping factors given in \ref{eq:pr3} as a series of shifted linear systems, in the form described in the following equation. This approach aims to reduce the computational demands of the problem.
% We know from literature that the Shifted Krylov methods may still suffer from slow convergence when the damping factor approaches 1, requiring larger search spaces to converge with satisfactory speed. In \cite{SHEN2022126799} is suggest that, to overcome this problem, we can combine stationary iterative methods and shifted Krylov subspace methods. They derive an implementation of the Power method that solves the PageRank problem with multiple dumpling factors at almost the same computational time of the standard Power method for solving one single system. They also demonstrate that this shifted Power method generates collinear residual vectors. Based on this result, they use the shifted Power iterations to provide smooth initial solutions for running shifted Krylov subspace methods such as \texttt{GMRES}. Besides, they discuss how to apply seed system choosing strategy and extrapolation techniques to further speed up the iterative process.
It has been previously noted in the literature that the Shifted Krylov methods may have slow convergence when the damping factor gets close to 1, requiring a larger search space to achieve satisfactory speed. In order to address this issue, the authors of \cite{SHEN2022126799} suggest combining stationary iterative methods with shifted Krylov subspace methods. They present an implementation of the Power method that can solve the PageRank problem with multiple damping factors in approximately the same amount of time as the standard Power method for solving a single system. They also show that this shifted Power method generates collinear residual vectors. Based on this result, they use the shifted Power iterations to provide smooth initial solutions for running shifted Krylov subspace methods such as \texttt{GMRES}. In addition, they discuss how techniques such as seed system choosing and extrapolation can be used to further accelerate the iterative process.
% As an attempt of a possible remedy in this situation, we present a framework that combines. shifted stationary iterative methods and shifted Krylov subspace methods. In detail, we derive the implementation of the Power method that solves the PageRank problem with multiple damping factors at almost the same computational cost of the standard Power method for solving one single system. Furthermore, we demonstrate that this shifted Power method generates collinear residual vectors. Based on this result, we use the shifted Power iterations to provide smooth initial solutions for running shifted Krylov subspace methods such as GMRES. Besides, we discuss how to apply seed system choosing strategy and extrapolation techniques to further speed up the iterative process.
\subsection{Overview of the classical PageRank problem}
% The Power method is considered one of the algorithms of choice for solving either the eigenvalue \ref{eq:pr} or the linear system \ref{eq:pr2} formulation of the PageRank problem, as it was originally used by Google. Power iterations write as
The Power method is a popular algorithm for solving either the eigenvalue problem in equation \ref{eq:pr} or the linear system in equation \ref{eq:pr2}, which were originally used to calculate PageRank by Google. It works by iteratively applying the matrix $A$ to an initial estimate of the solution, using the following formula:
x_{(k+1)} = Ax_k =\alpha\tilde P x_{(k)} + (1 - \alpha)v
\end{equation}
The convergence behavior is determined mainly by the ratio between the two largest eigenvalues of A. When $\alpha$ gets closer to $1$, though, the convergence can slow down significantly. \\
\noindent As stated in \cite{SHEN2022126799} The number of iterations required to reduce the initial residual down to a tolerance $\tau$, measured as $\tau=\lVert Ax_k - x_k \rVert=\lVert x_{k+1}- x_k \rVert$ can be estimated as $\frac{\log_{10}\tau}{\log_{10}\alpha}$. The authors provide an example: when $\tau=10^{-8}$ the Power method requires about 175 steps to converge for $\alpha=0.9$ but the iteration count rapidly grows to 1833 for $\alpha=0.99$. Therefore, for values of the damping parameter very close to 1 more robust alternatives to the simple Power algorithm should be used.