List ranking

List Ranking describes the task of assigning the elements of a linked list to their rank within the list. The rank of a list element is defined as the distance to the last list element.

Determining the ranks is necessary in order to convert lists into arrays without losing the order of the list elements , on which many parallel operations can be performed more efficiently. The sequential algorithm for solving the task is trivial with an effort in , parallel algorithms, on the other hand, are significantly more complex and difficult to implement, but achieve runtimes of , such as the Independent Set Removal algorithm presented in this article . ${\ displaystyle O (n)}$ ${\ displaystyle O (\ log n \ log \ log n)}$

This article is limited to singly linked lists , and the last element of a list should be identified by the fact that it refers to itself as the next element.

Sequential algorithm for singly linked lists

The sequential algorithm , also known as pointer chasing , runs through the list once from the first element to the last, and numbers the elements. In a second run, the rank is then determined from the numbering and the list length. The runtime for this procedure is in , where the length of the list is designated (see: Landau symbols ). ${\ displaystyle O (n)}$ ${\ displaystyle n}$

 1 //Sei s das erste Element der Liste.
 2 e := s
 3 d := 0
 4 e.rank := 0
 5 while e.next != e
 6    e := e.next
 7    d := d + 1
 8    e.rank:= d
 9 e := s
10 e.rank := d
11 while e.next != e
12    e := e.next
13    e.rank := d - e.rank

Parallel algorithms for list ranking

The parallel algorithms for the list ranking problem are designed and analyzed in the PRAM model . To enable parallel processing of the list, it must be assumed that the elements of the list can already be accessed via an array, but the array does not contain any information about the position of the list elements within the list.

Doubling

The doubling algorithm determines their rank for all list elements in parallel, but in contrast to the sequential algorithm, the list is not run through to the end for each element, but abbreviations are used that the processors work out together.

 1 //Seien e[i], i = 1,...,n, die Elemente der Liste.
 2 //Auf Prozessoren p_1, ..., p_n parallel:
 3 if e[p_i].next == e[p_i]
 4    e[p_i].rank := 0
 5 else
 6    e[p_i].rank := 1
 7 end if
 8 while e[p_i].next.next != e[p_i].next
 9    e[p_i].rank := e[p_i].rank + e[p_i].next.rank
10    e[p_i].next := e[p_i].next.next

Example for the doubling algorithm

The abbreviations are set up in the last line of the algorithm by overwriting the pointers to the respective successor of the elements with pointers to the successor of the successor. The lengths of these abbreviations are always doubled in each iteration, apart from the last iteration. In addition to the abbreviation, the length of the abbreviation is also stored. If the abbreviation refers to the last list element, the length of the abbreviation is precisely the rank of the list element.

Since the algorithm needs steps to calculate the rank of the first element by doubling the lengths of the abbreviations , the total runtime of the algorithm is also in . The efficiency of the algorithm is . ${\ displaystyle \ Theta (\ log n)}$ ${\ displaystyle \ Theta (\ log n)}$ ${\ displaystyle 1 / \ log n}$

The algorithm can be generalized to general processor numbers by assigning elements to each processor , which it then processes. The runtime is then in . ${\ displaystyle p}$ ${\ displaystyle n / p}$ ${\ displaystyle O ({\ frac {n} {p}} \ log n)}$

Independent set removal

The Independent Set Removal Algorithm can be used for any number of processors and is an improvement over the generalized doubling algorithm.

First, the list is divided into active and inactive elements, the doubling algorithm is then only applied to the active elements, then the rank of the inactive elements is determined from the already known ranks. This means that the Independent Set Removal Algorithm has the advantage that doubling no longer has to be carried out for every element of the list.

The ranks are initialized as with the doubling algorithm:

1 //Seien e[i], i = 1, ..., n, die Elemente der Liste
2 //Auf Prozessoren p_1, ..., p_q parallel:
3 if e[i].next == e[i]
4    e[i].rank := 0
5 else
6    e[i].rank := 1
7 end if

The aim is to select at most active elements from the list , with , and then to apply the doubling method to the active elements. In order for the remaining ranks to be determined, independent sets are needed: ${\ displaystyle m}$ ${\ displaystyle m <n}$

An independent subset of list elements satisfies the property: . For an element of the independent set, the successor of the element is not included in the independent set. Since the successor of the last element is itself, it is never contained in an independent set. If an independent set is marked as inactive, each inactive element has an active successor. If the ranks of the active elements are then determined by means of doubling, the ranks of the inactive elements can be determined by simply calling up the rank of the active successor. ${\ displaystyle I}$ ${\ displaystyle e \ in I \ Rightarrow e.next \ notin I}$

However, it is generally not enough to determine a single independent set and mark it as inactive to ensure that at most elements are active: Neighboring elements are never in an independent subset, so an independent subset can contain at most half of the list elements. ${\ displaystyle m}$

Independent subsets must be repeatedly removed from the list until at most elements are still active. Removing an element is comparable to determining an abbreviation in the doubling algorithm: the length of the abbreviation must be saved in the predecessor of the element to be removed. ${\ displaystyle m}$

1 //Sei I eine unabhängige Menge.
2 //Für alle e[i] nicht in I parallel auf q Prozessoren:
3 if e[i].next enthalten in I
4    //e[i].next soll aus der Liste entfernt werden
5    e[i].rank := e[i].rank + e[i].next.rank
6    e[i].next := e[i].next.next

Example for the Independent Set Removal Algorithm

An important observation is that all active elements store the distance to the next active element in their rank variable. This property persists when independent subsets are repeatedly removed.

If recursively independent subsets are removed in this way until the doubling algorithm can be executed, active elements that know their rank and inactive elements that, when they were still active, knew their distance to the next active element, are obtained after execution. If the subsets of inactive elements are now reinserted into the list in the reverse order in which they were removed, these elements continue to know the distance to the next active element, whereby the rank is achieved by adding the distance to the next active element and its Rank can be determined.

The following procedure can be used to determine an independent subset: it is probable that each element of the list is marked as inactive. For each inactive element, a parallel check is carried out to determine whether the successor is also inactive, and if so, it is marked as active. The independent subset obtained in this way contains the entire list as expected . Although this approach is inefficient, this approach outperforms all alternatives because of the extremely low cost of each operation. ${\ displaystyle 1/2}$ ${\ displaystyle 1/4}$

The full algorithm:

Eingabe: Liste der Länge n
1. Initialisiere die Ränge
2. R := 0
   while mehr als m Elemente in der Liste
      R := R + 1
      Wähle eine unabhängige Teilmenge
      Entferne die unabhängige Teilmenge
3. Führe Doubling Algorithmus für die restlichen Elemente auf q Prozessoren aus
4. for i := R to 1
      Füge die Elemente ein, die bei Durchlauf i entfernt wurden
      Bestimme die Ränge der wieder hinzugefügten Elemente

It can be shown that , with this now denote the number of processors is valid for the expected life: . With results and an efficiency of 1. ${\ displaystyle m = p}$ ${\ displaystyle p}$ ${\ displaystyle T (n, p) \ in O ({\ frac {n} {p}} + \ log {\ frac {n} {p}} \ log p)}$ ${\ displaystyle p = {\ frac {n} {\ log n \ log \ log n}}}$ ${\ displaystyle T (n, p) \ in O (\ log n \ log \ log n)}$

Jájá proves in his book for a similar algorithm, in which the independent set is determined deterministically, a running time of , for the choice of , with an efficiency of . ${\ displaystyle O (\ log n \ log \ log n)}$ ${\ displaystyle m = p = n / \ log n}$ ${\ displaystyle {\ frac {1} {\ log \ log n}}}$

Outside the PRAM model

The algorithms described benefit on the one hand from the synchronized execution in the PRAM model, which is not the case with real machines, and on the other hand from the global memory of the PRAM model. However, it is very difficult to get good results on real machines with a connection network because the list ranking problem is an extremely non-local task. This means that it is not clear in advance which data a processor has to request from the network in order to then do local work, since only by processing a list element does it become clear which element the processor will need next, which leads to a lot of slow accesses on the memory.

Individual evidence

↑ Jop F. Sibeyn: Better trade-offs for parallel list ranking. In: Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures. 1997, pp. 221-222.
↑ Peter Sanders, Michael Ikkert, Tim Kieritz, Parallel Algorithms. Lecture notes on parallel algorithms at the Karlsruhe Institute of Technology. Accessed March 31, 2019. p. 106.
↑ Jop F. Sibeyn, Frank Guillaume, Tillmann Seidel: Practical Parallel List Ranking. In: Journal of Parallel and Distributed Computing. Volume 56, Issue 2, 1999, p. 162.
↑ Peter Sanders, Michael Ikkert, Tim Kieritz, Parallel Algorithms. Lecture notes on parallel algorithms at the Karlsruhe Institute of Technology. Accessed March 31, 2019. pp. 107 + 108.
↑ J. Jaja: An Introduction to Parallel Algorithms. 1992, p. 97.
↑ Jop F. Sibeyn, Frank Guillaume, Tillmann Seidel: Practical Parallel List Ranking. In: Journal of Parallel and Distributed Computing. Volume 56, Issue 2, 1999, p. 157.

[1] Jop F. Sibeyn: Better trade-offs for parallel list ranking. In: Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures. 1997, pp. 221-222.

[2] Peter Sanders, Michael Ikkert, Tim Kieritz, Parallel Algorithms. Lecture notes on parallel algorithms at the Karlsruhe Institute of Technology. Accessed March 31, 2019. p. 106.

[3] Jop F. Sibeyn, Frank Guillaume, Tillmann Seidel: Practical Parallel List Ranking. In: Journal of Parallel and Distributed Computing. Volume 56, Issue 2, 1999, p. 162.

[4] Peter Sanders, Michael Ikkert, Tim Kieritz, Parallel Algorithms. Lecture notes on parallel algorithms at the Karlsruhe Institute of Technology. Accessed March 31, 2019. pp. 107 + 108.

[5] J. Jaja: An Introduction to Parallel Algorithms. 1992, p. 97.

[6] Jop F. Sibeyn, Frank Guillaume, Tillmann Seidel: Practical Parallel List Ranking. In: Journal of Parallel and Distributed Computing. Volume 56, Issue 2, 1999, p. 157.