Graph partitioning

from Wikipedia, the free encyclopedia
QS IT
This article was due to content flaws on the quality assurance side of the computer science editorial added. This is done in order to bring the quality of the articles from the subject area of ​​computer science to an acceptable level. Help to eliminate the shortcomings in this article and take part in the discussion !  ( + )


Reason: general understanding - V ¿ 22:37, 6 Aug 2013 (CEST)

Partitioned graph (2-partit)

Graph partitioning refers to the application of suitable algorithms to calculate graph partitions (see section (graph theory) ) with desired properties. A graph is called r-partit if there is a partition (division) of its nodes into r parts, so that the end corners of each edge of the graph are in different partition classes.

Graph partitioning in parallel programming

Formulation of the problem

Graph partitioning is mainly used in parallel processing: In order to be able to optimally utilize the advantages of a parallel system in a computationally intensive computer program, it must meet two criteria:

  1. The computing load must be evenly distributed across the computing units.
  2. The communication between the processing units that is necessary for processing the program should be kept as small as possible, since this requires immense execution time.

The optimization problem as a graph partitioning problem

This optimization problem can be formulated as a graph partitioning problem:

  • The individual calculation tasks in the program are modeled as nodes in a graph.
  • For every calculation that depends on the result of another calculation, the corresponding nodes are connected with an edge .
  • After partitioning, the computed subsets of the graph reflect the processors to which the tasks are to be distributed.

So the optimization problem is new: Find a partition of the given graph such that:

  1. The nodes are evenly distributed among the subsets.
  2. As few edges as possible Connect nodes from two different subsets.

Edges whose adjacent nodes are in different subsets are intersected by the partition and are therefore referred to as intersecting edges .

Weighted graph

The optimization problem can also be formulated for weighted graphs. This means that computational tasks of different intensity can be represented by different node weights . Also, by weighted edge data exchange are modeled by different amounts of data. The optimization problem is called more generally:

  1. Divide the knot weight evenly among the subsets and
  2. minimize the sum of the weights of the cut edges.

The sum of the weights of the cut edges is also referred to as the cutting size (English cutsize , edge-cut ). The above special formulation of the optimization problem is equivalent to this if all edges and nodes are given the weight 1.

example

In the figure above, an (unweighted) graph with six nodes and eight edges has been cut into two parts, each with three nodes. A subset is assigned to processor 1 and the other to processor two. Two edges are cut, which means a certain amount of communication effort. There is no other even distribution of the nodes that does not produce more than two cutting edges. So this partition is optimal.

Algorithms

Computing the optimal partition for a graph is an NP-complete problem. Therefore there are a number of proposed heuristics to find at least approximately optimal partitions in a short time.

These can be roughly broken down according to the approaches they are pursuing:

Recursive bisection

This method can be used in all of the algorithms mentioned below. It pursues the common divide-and-conquer principle and is that the graph only two subsets cut is. The resulting subgraphs are then recursively divided into two again until the desired number k of subsets is reached. This number must therefore be a power of two, i.e. H. for a (= recursion depth). Indeed, this is often the case in practice, e.g. B. in a parallel computer that contains processors.

By using recursive bisection one accepts a suboptimal solution in order to achieve a large gain in time.

Geometric algorithms

Geometric algorithms make use of coordinate information about the nodes. As such, a graph has no coordinates. In some areas of application, however, the graph is formed from a two- or three-dimensional network. This is e.g. This is the case, for example, when a real object is modeled by means of a network in a physical simulation, on which calculations are then to be carried out in each node. Most of these are only dependent on the results of the neighboring nodes, so that the network can be used directly as a graph for partitioning. The coordinate information of such networks then reflects the topology of the graph fairly well.

Examples of geometric algorithms:

  • Coordinate bisection: Choose the coordinate (e.g. x) in which the nodes are furthest apart, and for this a limit value c such that (x> c) applies to half of the nodes.
  • Inertial bisection : Calculate the inertial axis and choose this instead of a coordinate axis.

Spectral bisection

The idea of ​​spectral bisection is to formulate the (discrete) optimization problem mathematically in a continuous system of equations and to derive the solution analytically. Then one tries to discretely approximate the solution of the continuous system of equations.

Combinatorial Algorithms

There are a number of combinatorial algorithms for graph partitioning without coordinate information, which are only mentioned here by name:

Multilevel partitioning

Simple example of an ML partitioning (1 → 2: coarsening, 2 → 3: partitioning, 3 → 4: refinement)

The idea here is to use so-called matchings to shrink a large graph into a smaller one, which represents the global structure of the original one. This shrinkage ( coarsening ) is repeated several times until only a few (e.g. less than 100) nodes are left. Then the smallest graph is partitioned. The partitioning is calculated back to the next larger graph and there z. B. optimized by means of Kernighan-Lin (English refinement ), then counted back to the next larger graph until you have landed at the original graph. With this scheme, both the local and the global topology of the graph are taken into account, which leads to very good results.

Streaming algorithms

With this approach, the edges or nodes of the graph are read in in any order and distributed sequentially to the partitions using a heuristic function. With this approach, it is not necessary to keep the entire graph in memory, as in the case of the multilevel method. In addition, streaming algorithms have low latency due to their simple heuristics. If nodes are streamed, one speaks of edge cut, in the case of edges one speaks of vertex cut. Examples of streaming algorithms are:

  • HDRF
  • Greedy

A special approach is followed by the ADWISE algorithm. It is a streaming algorithm, but at the same time uses an edge buffer. This allows the algorithm to optimize its edge assignments within the framework of the buffer size. The runtime of the algorithm can also be influenced via the buffer size. The larger the buffer, the better the calculated partitioning, but this also leads to more latency.

Open source graph partitioning packages

  • KaHIP (Karlsruhe High Quality Partitioning).
  • kMetis.
  • Scotch.

literature

  • Aydin Buluc, Ilya Safro, Henning Meyerhenke, Peter Sanders, Christian Schulz: Recent Advances in Graph Partitioning . 2013, arxiv : 1311.3144 .
  • Christian Schulz: High Quality Graph Partitioning . epubli GmbH, Berlin 2013, ISBN 978-3-8442-6462-3 ( online [PDF] dissertation, Karlsruhe Institute of Technology).
  • U. Elsner: Graph partitioning - a survey . 1997 (English).

Individual evidence

  1. ^ Diestel, Reinhard .: Graph theory . 3., rework. and exp. Springer, Berlin 2006, ISBN 3-540-21391-0 , pp. 18 .
  2. Christian Mayer, Ruben Mayer, Muhammad Adnan Tariq, Heiko Geppert, Larissa Laich, Lukas Rieger, Kurt Rothermel: ADWISE: Adaptive Window-based Streaming Edge Partitioning for High-Speed ​​Graph Processing. (PDF) University of Stuttgart, accessed on July 27, 2018 (English).
  3. ^ Christian Schulz: High Quality Graph Partitioning. Dissertation. Karlsruhe Institute of Technology, 2013.
  4. ^ Peter Sanders, Christian Schulz: Engineering Multilevel Graph Partitioning Algorithms. In: Proceedings of the 19th European Symposium on Algorithms (ESA'11). volume 6942 of LNCS, pages 469--480. Springer, 2011.
  5. ^ Peter Sanders, Christian Schulz: Distributed Evolutionary Graph Partitioning. In: Proceedings of the 12th Workshop on Algorithm Engineering and Experimentation (ALENEX'12). Pp. 16-19, 2012.
  6. ^ Peter Sanders, Christian Schulz: High Quality Graph Partitioning. In: Proceedings of the 10th DIMACS Implementation Challenge Workshop: Graph Partitioning and Graph Clustering. Pp. 1–17, AMS, 2013.
  7. http://algo2.iti.kit.edu/documents/kahip/index.html
  8. G. Karypis, V. Kumar: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing 20 (1): p. 359. (1999)
  9. http://glaros.dtc.umn.edu/gkhome/metis/metis/overview
  10. http://www.labri.fr/perso/pelegrin/scotch/