Bucket location

Bucketsort (from English bucket " bucket ") is a sorting process that sorts an input list in linear time for certain value distributions . The algorithm is divided into three phases:

Distribution of the elements to the buckets (partitioning)
Each bucket is sorted using a further sorting process such as merge sort.
The content of the sorted buckets is concatenated .

The process works out-of-place .

algorithm

The entry of bucket sort is a list with elements and a function of each element of the list in the semi-open interval monotonically in the manner depicts that for sorting standard . If the sorting order is based on a comparison of binary data, the bits with the highest significance can be used . During sorting, the algorithm uses “buckets” arranged in an array . The elements are distributed via this array by placing each element in the -th bucket. Then each bucket is sorted one after the other. In the final phase, the bucket lists are concatenated in the order in which they are arranged in the array, which results in the sorted output. ${\ displaystyle l}$ ${\ displaystyle n}$ ${\ displaystyle f}$ ${\ displaystyle [0.1 [}$ ${\ displaystyle f (e) \ leq f (e ^ {\ prime})}$ ${\ displaystyle e}$ ${\ displaystyle \ prec e ^ {\ prime}}$ ${\ displaystyle \ prec}$ ${\ displaystyle k}$ ${\ displaystyle e}$ ${\ displaystyle \ lfloor f (e) \ cdot k \ rfloor}$

As a pseudo-code:

 bucket_sort(l, f, k)
   buckets = array(k)
   foreach (e in l)
     buckets[ floor(f(e) * k) ].add(e)
   r = []
   foreach (b in buckets)
     x = mergesort(b)
     r.append(x)
   return r

The algorithm sorts in a stable manner if the sort algorithm used to sort the buckets, here mergesort , is stable.

complexity

The distribution of the function values of determines the runtime of Bucketsort. The runtime is in (in O notation ), with the number of elements in the -th bucket. In the case of a uniform distribution , the total running time is in , since the sum over the buckets is linear and their summands can be viewed as constant (with an exact uniform distribution = 1). The efficient running time of is given not only for a uniform distribution, but also for all distributions according to which the sum term is asymptotically linear. It is also seen as the average case duration. ${\ displaystyle f}$ ${\ displaystyle {\ mathcal {O}} (n) + \ sum _ {i = 0} ^ {n-1} {\ mathcal {O}} (l_ {i} \ log l_ {i})}$ ${\ displaystyle l_ {i}}$ ${\ displaystyle i}$ ${\ displaystyle {\ mathcal {O}} (n)}$ ${\ displaystyle {\ mathcal {O}} (n)}$

In the case of other value distributions, the runtime of the bucket sorting algorithm can be dominated by the runtime of the sorting algorithm that is used to sort a bucket. Such a worst-case scenario occurs, for example, if all elements are assigned to a single bucket. When using mergesort to sort the buckets, the total runtime is then in . ${\ displaystyle {\ mathcal {O}} (n \ log n)}$

Of course, this second level sorting can again be implemented as a bucket sort, then with sub-buckets per bucket. This procedure is described in the article Radixsort and is a form of the MSD Radixsort.

The memory requirement is in . ${\ displaystyle {\ mathcal {O}} (n)}$

Individual evidence

↑ s. #Mealhorn .
But also an exhaustive calculation

${\ displaystyle n}$	Number of partitions	${\ displaystyle \ emptyset}$ Number of comparisons	${\ displaystyle \ emptyset / n}$
2	2	0.5000	0.25000
3	3	0.9630	0.32099
4th	5	1.4167	0.35417
5	7th	1.8675	0.37349
6th	11	2.3169	0.38616
7th	15th	2.7657	0.39510
8th	22nd	3.2140	0.40175
9	30th	3.6620	0.40689
10	42	4.1098	0.41098
11	56	4.5575	0.41432
12	77	5.0051	0.41709
13	101	5.4526	0.41943
14th	135	5.9000	0.42143
15th	176	6.3474	0.42316
16	231	6.7947	0.42467
17th	297	7.2420	0.42600
18th	385	7.6893	0.42718
19th	490	8.1366	0.42824
20th	627	8.5838	0.42919
21st	792	9.0310	0.43005
22nd	1002	9.4782	0.43083
23	1255	9.9254	0.43154
24	1575	10.373	0.43219
25th	1958	10,820	0.43279
26th	2436	11.267	0.43334
27	3010	11.714	0.43386
28	3718	12,161	0.43433
29	4565	12.608	0.43477
30th	5604	13.056	0.43518
31	6842	13.503	0.43557
32	8349	13,950	0.43593
33	10143	14.397	0.43627

over all permutations shows that up to an element number of on average less than comparisons are required for complete sorting. ${\ displaystyle n \ leq 33}$ ${\ displaystyle n / 2}$

literature

Kurt Mehlhorn , Peter Sanders : Algorithms and Data Structures. The Basic Toolbox . Springer, Berlin / Heidelberg 2008, ISBN 978-3-540-77977-3 , doi : 10.1007 / 978-3-540-77978-0 . 5.6 Breaking the Lower Bound
Thomas H. Cormen, Charles E. Leiserson , Ronald L. Rivest , Clifford Stein: Introduction to Algorithms . 2nd Edition. MIT Press, Cambridge (Massachusetts) 2001, ISBN 0-262-03293-7 , pp. 174 .
Donald E. Knuth : The Art of Computer Programming : Volume 3 Sorting and Searching . 2nd Edition. Addison-Wesley, Reading MA 1997, ISBN 0-201-89685-0 , pp. 169 .
Apostolos Burnetas and Daniel Solow and Rishi Agarwal: An analysis and implementation of an efficient in-place bucket sort . In: Acta Informatica . tape 34 , no. 9 , 1997, pp. 687–700 , doi : 10.1007 / s002360050103 (The bucket sort variant is called groupsort).
EJ Isaac and RC Singleton: Sorting by Address Calculation . In: Journal of the ACM . tape 3 , no. 3 , July 1956, p. 169-174 , doi : 10.1145 / 320831.320834 .

[1] s. #Mealhorn .
But also an exhaustive calculation

${\ displaystyle n}$
Number of partitions ${\ displaystyle \ emptyset}$ Number of
comparisons ${\ displaystyle \ emptyset / n}$

2 2 0.5000 0.25000

3 3 0.9630 0.32099

4th 5 1.4167 0.35417

5 7th 1.8675 0.37349

6th 11 2.3169 0.38616

7th 15th 2.7657 0.39510

8th 22nd 3.2140 0.40175

9 30th 3.6620 0.40689

10 42 4.1098 0.41098

11 56 4.5575 0.41432

12 77 5.0051 0.41709

13 101 5.4526 0.41943

14th 135 5.9000 0.42143

15th 176 6.3474 0.42316

16 231 6.7947 0.42467

17th 297 7.2420 0.42600

18th 385 7.6893 0.42718

19th 490 8.1366 0.42824

20th 627 8.5838 0.42919

21st 792 9.0310 0.43005

22nd 1002 9.4782 0.43083

23 1255 9.9254 0.43154

24 1575 10.373 0.43219

25th 1958 10,820 0.43279

26th 2436 11.267 0.43334

27 3010 11.714 0.43386

28 3718 12,161 0.43433

29 4565 12.608 0.43477

30th 5604 13.056 0.43518

31 6842 13.503 0.43557

32 8349 13,950 0.43593

33 10143 14.397 0.43627

over all permutations shows that up to an element number of on average less than comparisons are required for complete sorting. ${\ displaystyle n \ leq 33}$ ${\ displaystyle n / 2}$

Bucket location

contents

algorithm

complexity

See also

Individual evidence

literature