Insertion location

Insertion sorting (also insert sorting method or sorting by inserting , English insertion 'insert' and English sort 'sorting' ) is a simple, stable sorting method (i.e. the order of elements with the same key value remains unchanged). It is easy to implement, efficient with small or already partially sorted input quantities. In addition, Insertionsort does not require any additional storage space because the algorithm works in-place . Another advantage is that insertion sort can be used as an online algorithm .

Insertionort takes any element from the unsorted input sequence and inserts it at the right place in the (initially empty) output sequence. If you proceed in the order of the original sequence, the process is stable. If you are working on an array , the elements behind the newly inserted element must be moved. This is actually the complex operation of the insertion site. Finding the correct insertion position can be done comparatively efficiently using a binary search . Basically, however, insertion sorting works far less efficiently than other, more demanding sorting processes.

Problem Description

The procedure is comparable to sorting a sheet of playing cards. At the beginning the cards of the hand are face down on the table. The cards are revealed one by one and inserted in the correct position in the hand held. In order to find the insertion point for a new card, either the card is compared successively (from left to right) with the cards already sorted in the sheet, or a binary search is carried out. At any point in time, the cards in the hand are sorted and consist of the cards already removed from the table. To insert the new card, all of the following cards must move one position to the right.

input

A sequence of numbers to be sorted . ${\ displaystyle n}$ ${\ displaystyle \ left (a_ {1}, a_ {2}, \ ldots, a_ {n} \ right)}$

The numbers are also known as keys; these are often only part of a data set.

implementation

Pseudocode

The following pseudocode sorts the input sequence in ascending order. In order to sort in descending order, the second comparison in line 4 must be changed accordingly. The parameter Ais a field with the sequence unsorted at the beginning. After completion of the algorithm contains Athe elements A[0], A[1]..., A[n-1]the sorted sequence.
Please note that the indexing of the field begins with a 0. : Number of elements of : Index of the last element of
${\ displaystyle n}$ A
${\ displaystyle n-1}$ A

INSERTIONSORT(A)
1 for i = 1 to (Länge(A)-1) do
2      einzusortierender_wert = A[i]
3      j = i
4      while (j > 0) and (A[j-1] > einzusortierender_wert) do
5           A[j] = A[j - 1]
6           j = j − 1
7      end while
8      A[j] = einzusortierender_wert
9 end for

Structogram

The following is a Nassi-Shneiderman diagram (structogram) of the insertion location algorithm. The identifiers are based on the above pseudocode.

Count i from 1 to n-1
	value to be sorted = A [i]
	j = i
	As long as j> 0 and A [j-1]> value to be sorted
		A [j] = A [j-1]
		j = j - 1
	A [j] = value to be sorted

example

Execution of insertion sort on input field . The component to which the index points is colored red. Fields colored blue are in the already sorted subfield . ${\ displaystyle A [0..5]}$ ${\ displaystyle i}$ ${\ displaystyle A [1..i-1]}$

0	1	2	3	4th	5
5	2	4th	6th	1	3

Since an individual element is not subject to any ordering relation, the index starts at and the second element is compared with the first. ${\ displaystyle i = 1}$

0	1	2	3	4th	5
5	2	4th	6th	1	3

0	1	2	3	4th	5
2	5	4th	6th	1	3

The 5 slides back in the blue sorted sublist and the 2 is inserted at the beginning of this. The first two elements of the sequence are now sorted and the next element is checked ( ). ${\ displaystyle i = 2}$

0	1	2	3	4th	5
2	5	4th	6th	1	3

0	1	2	3	4th	5
2	4th	5	6th	1	3

There is nothing more to do with, as 6 is already in the correct position at the end of the sorted parts list. ${\ displaystyle i = 3}$

0	1	2	3	4th	5
2	4th	5	6th	1	3

In the penultimate step, the 1 is selected and inserted into the sorted list. In doing so, all previously sorted elements in the sorted list move one step back ( ). ${\ displaystyle i = 4}$

0	1	2	3	4th	5
2	4th	5	6th	1	3

0	1	2	3	4th	5
1	2	4th	5	6th	3

In the last step, the 3 is placed in the appropriate position in the sorted parts list ( ). ${\ displaystyle i = 5}$

0	1	2	3	4th	5
1	2	4th	5	6th	3

0	1	2	3	4th	5
1	2	3	4th	5	6th

All fields of the sequence are sorted according to the algorithm.

0	1	2	3	4th	5
1	2	3	4th	5	6th

complexity

The number of comparisons and shifts of the algorithm depends on the arrangement of the elements in the unsorted input sequence. For the average case, it is therefore difficult to estimate the runtime precisely, but it can be shown that the average case is in . In the best case , when the input array is already sorted, the complexity is linear ; H. even better than the more complicated methods ( quicksort , mergesort , heapsort, etc.). In the worst case it is square . ${\ displaystyle {\ mathcal {O}} (n ^ {2})}$ ${\ displaystyle {\ mathcal {O}} (n)}$ ${\ displaystyle {\ mathcal {O}} (n ^ {2})}$

If the binary search is used to determine the correct position of an element, one can determine the number of comparisons in the worst case

{\ displaystyle \ log (n!) \ in {\ mathcal {O}} (n \ log nn \ log e + \ log n) = {\ mathcal {O}} (n \ log n-0 {,} 4426n + \ log n)}

estimate; however, the stability of the sorting process may be lost in the process.

The number of shift operations in the average case is

{\ displaystyle n (n-1) / 4 \ in {\ mathcal {O}} (n ^ {2})}

.

The worst case is an array sorted in descending order , since each element is shifted from its original position to the first array position and shift operations are necessary. Their total number is thus ${\ displaystyle A}$ ${\ displaystyle j}$ ${\ displaystyle j-1}$

{\ displaystyle n (n-1) / 2 \ in {\ mathcal {O}} (n ^ {2})}

.

Further development

Donald L. Shell proposed a substantial improvement to this algorithm, which is now known as Shellsort . Instead of neighboring elements, elements that are separated by a certain distance are compared. This distance is reduced with each round. Because of the sorting over distance, the sorting method loses its “stable” property.

Robert Sedgewick published an optimized implementation of Insertionsort, which uses a Sentinel and only needs half of the swaps. This optimization is illustrated below using a “papyrus script function”. Float [] a is an example of an array with floating point numbers. The two integer parameters represent the flexible sorting area for the array (start value “L”, end value “R”). Assuming the array has 100 elements and starts at 1, then L = 1 and R = 100 must be set in order to sort it completely.

FUNCTION SortByInsert(Float[] a, Int L, Int R)
 1  bool bOK
 2  float X              ; Comparable v
 3  float f
 4
 5  int k = -1           ; original: k = 0  // counter of exchanges
 6  int i = R            ; original: R - 1
 7  X = a[i]             ; Sentinel
 8  WHILE (i > L)        ; TopDown loop
 9    f = a[i - 1]
10    IF (X < f)
11      a[i - 1] = X     ; exchange
12      a[i]     = f
13      k = i            ; original: k = k + 1
14    ELSE
15      X = f            ; no exchange/swap, update Sentinel only
16    ENDIF
17    i = i - 1
18  ENDWHILE
19
20  IF (k < 0)
21    RETURN             ; - STOP -  short circuit, no exchanges made
22  ENDIF
23  ; -------------------------- "insertion sort with half-exchanges"
24  i = L + 2
25  WHILE (i <= R)       ; original: (i < R)
26    X = a[i]           ; Sentinel
27    k = i              ; original: j = i  // counter for insertions
28    bOK = TRUE
29    WHILE (bOK)
30      f = a[k - 1]
31      IF (X < f)
32        a[k] = f
33        k = k - 1
34      ELSE
35        bOK = False
36      ENDIF
37    ENDWHILE
38    IF (k < i)
39      a[k] = X         ; original: a[j] = v
40    ENDIF
41    i = i + 1
42  ENDWHILE
ENDFUNCTION

Web links