C
C             Parallel Sparse BLAS  version 3.0
C   (C) Copyright 2006, 2007, 2008, 2009, 2010
C        Salvatore Filippone    University of Rome Tor Vergata
C        Alfredo Buttari        CNRS-IRIT, Toulouse
C
C  Redistribution and use in source and binary forms, with or without
C  modification, are permitted provided that the following conditions
C  are met:
C    1. Redistributions of source code must retain the above copyright
C       notice, this list of conditions and the following disclaimer.
C    2. Redistributions in binary form must reproduce the above copyright
C       notice, this list of conditions, and the following disclaimer in the
C       documentation and/or other materials provided with the distribution.
C    3. The name of the PSBLAS group or the names of its contributors may
C       not be used to endorse or promote products derived from this
C       software without specific written permission.
C
C  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
C  ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
C  TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
C  PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE PSBLAS GROUP OR ITS CONTRIBUTORS
C  BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
C  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
C  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
C  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
C  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
C  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
C  POSSIBILITY OF SUCH DAMAGE.
C
C
***********************************************************************
*                                                                     *
*  The communication step among processors at each                    *
*  matrix-vector product is a variable all-to-all                     *
*  collective communication that we reimplement                       *
*  in terms of point-to-point communications.                         *
*  The data in input is a list of dependencies:                       *
*  for each node a list of all the nodes it has to                    *
*  communicate with. The lists are guaranteed to be                   *
*  symmetric, i.e. for each pair (I,J) there is a                     *
*  pair (J,I). The idea is to organize the ordering                   *
*  so that at each communication step as many                         *
*  processors as possible are communicating at the                    *
*  same time, i.e. a step is defined by the fact                      *
*  that all edges (I,J) in it have no common node.                    *
*                                                                     *
*  Formulation of the problem is:                                     *
*  Given an undirected graph (forest):                                *
*  Find the shortest series of steps to cancel all                    *
*  graph edges, where at each step all edges belonging                *
*  to a matching in the graph are canceled.                           *
*                                                                     *
*  An obvious lower bound to the optimum number of steps              *
*  is the largest degree of any node in the graph.                    *
*                                                                     *
*  The algorithm proceeds as follows:                                 *
*  1. Build a list of all edges, e.g. copy the                        *
*     dependencies lists keeping only (I,J) with I