Interpolation search
The interpolation search , also called interval search is one of the binary search derived search method that on lists and fields is used.
While the binary search algorithm always checks the middle element of the search space, the interpolation search algorithm tries to guess a more favorable division point than the middle in the search space. The way it works is comparable to that of a person looking for a word in a dictionary: the search for cylinder is usually started at the end of the dictionary, while the search for eel should begin at the front.
The algorithm
The interpolation search assumes sorted data. It is best suited for uniformly distributed data. Furthermore, random access to the elements is required. During the interpolation search, the data is divided depending on the key. If this has a high value, the element you are looking for is in the lower part of the data due to the presorting. Accordingly, the division is also carried out in the rear part of the data. With a small key, the field is split accordingly in the front part.
The division position can be calculated for all data by first dividing the number of all elements by the number of different elements, and then multiplying by the key you are looking for:
The position of the element searched for is thus interpolated by using the uniform distribution of the data to map the key to the list or field.
It can now be checked whether the key of the dividing element has a greater or lesser value than the key of the element sought. If the keys are identical, the search has already ended. If the dividing element has a smaller value, the right part is examined further, otherwise the left part. The number of elements and the number of different keys are determined for the new area, and then a new division position is interpolated.
example
A short practical example is intended to illustrate how the interpolation search works. To do this, the value 7 is searched for in the following elements:
2 | 4th | 7th | 9 | 12 | 21st | 26th | 31 | 37 |
Initially, the left (l) and right (r) boundaries are set on the boundaries of the array. Then the division element is calculated using the following formula:
So this gives for the array (red = search area, blue = x, bold = searched):
2 | 4th | 7th | 9 | 12 | 21st | 26th | 31 | 37 |
A check is then made to see whether the element found is the one sought. If this is the case, the process can be canceled, otherwise the search area is restricted. If the x is too small - you have to look to the right - the left limit is set to x + 1 and a search is made therein. Otherwise - you have to search left (or the x is too big) - the right limit is set to x − 1 and the search is now made in the left area.
Since the value A [1] = 4 is smaller than the element searched for, the left limit is set to l = x + 1 = 2 . The right limit remains and the following formula results:
The array now looks like this:
2 | 4th | 7th | 9 | 12 | 21st | 26th | 31 | 37 |
Since A [x] = A [2] = 7 = v , i.e. the element was found, it can be canceled and x returned as the solution after two steps.
complexity
An investigation of the interpolation search turns out to be very complex, but the running time ( n is the number of elements) can be assumed in the average case. In the worst case (the interpolated expected position is always at the edge), however, the running time is . The binary quadratic search solves this impairment .
Example implementations
VB.NET 2008
'Statt einer List(Of Integer) könnte auch IEnumerable(Of Integer), etc. verwendet werden. IEnumerable ermöglicht die Übergabe
'sowohl von generischen Listen, als auch Arrays
Public Function InterpolatedSearch(ByVal key As Integer, ByVal array As List(Of Integer)) As Integer
Dim left As Integer = 0
Dim right As Integer = array.Count - 1
Dim diffElem, pos As Integer
Do While (key >= array(left)) AndAlso (key <= array(right))
diffElem = array(right) - array(left)
pos = left + Math.Floor((right - left) * (key - array(left)) / diffElem)
If key > array(pos) Then
left = pos + 1
ElseIf key < array(pos)
right = pos - 1
Else
Return pos
End If
Loop
Return -1
End Function
Java
public int interpolierteSuche(int schluessel, int daten[]) {
int links = 0; // linke Teilfeldbegrenzung
int rechts = daten.length - 1; // rechte Teilfeldbegrenzung
int versch; // Anzahl verschiedener Elemente
int pos; // aktuelle Teilungsposition
// solange der Schlüssel im Bereich liegt (andernfalls ist das gesuchte
// Element nicht vorhanden)
while( schluessel >= daten[links] && schluessel <= daten[rechts] ){
// Aktualisierung der Anzahl der verschiedenen Elemente
versch = daten[rechts] - daten[links];
// Berechnung der neuen interpolierten Teilungsposition
pos = links + (int)(((double)rechts - links) * (schluessel - daten[links])
/ versch);
if( schluessel > daten[pos] ) // rechtes Teilintervall
links = pos + 1; // daten[pos] bereits überprüft
else if( schluessel < daten[pos] ) // linkes Teilintervall
rechts = pos - 1; // daten[pos] bereits überprüft
else // Element gefunden
return pos; // Position zurückgeben
}
return -1; // Element nicht gefunden
}
Delphi
type
TIntArray = array of integer;
function interpolierteSuche(schluessel: integer; var daten: TIntArray): integer;
var
links, rechts,
versch, aPos: integer;
begin
links := 0;
rechts := high(daten);
versch := 0;
aPos := 0;
while (schluessel >= daten[links]) and (schluessel <= daten[rechts]) do
begin
versch := daten[rechts] - daten[links];
aPos := links + round((rechts - links) * (schluessel - daten[links]) / versch);
if (schluessel > daten[aPos]) then
links := aPos + 1 else
if (schluessel < daten[aPos]) then
rechts := aPos - 1 else
begin
result := aPos;
exit;
end;
end;
result := - 1;
end;
C.
/**
* Liefert 1 zurück, wenn X in M gefunden wurde, ansonsten 0.
* Beim Aufruf wird als 4. Argument eine Variable per Adresse
* übergeben, in die bei Erfolg die Position von X in M geschrieben wird.
* @param const int[] M Feld, in dem gesucht werden soll
* @param int n Groesse des Feldes
* @param int X der gesuchte Eintrag
* @param int * index Position des gesuchten Eintrags X in M
* @return int 1=gefunden, 0=nicht gefunden
*/
int interpolation_search( const int M[], int n, int X, int *index )
{
double dx, dy;
double m; // Steigung
double b; // Y-Achsenabschnitt
int links = 0; // x1
int rechts = n-1; // x2
int pos; // vermutete Position
if ( M==NULL || X < M[0] || X > M[n-1] )
{
return 0;
}
while ( links <= rechts )
{
dx = rechts - links;
if ( dx == 1 )
{
if ( M[rechts] == X )
{
*index = rechts;
return 1;
}
else if ( M[links] == X )
{
*index = links;
return 1;
}
return 0;
}
if ( dx == 0 ) // 0 Division vorbeugen
{
return 0;
}
dy = M[rechts] - M[links];
if ( dy == 0 ) // keine Steigung
{
if ( M[links] == X )
{
*index = links;
return 1;
}
else
{
return 0;
}
}
m = dy / dx; // Steigung
b = (X - M[links]) / m;
pos = links + b; // Vermutete Position berechnen
if ( M[pos] == X )
{
*index = pos;
return 1;
}
else if ( M[pos] > X )
{
rechts = pos - 1;
}
else
{
links = pos + 1;
}
}
return 0;
}
See also
literature
- Robert Sedgewick: Algorithms in C . Addison-Wesley, Bonn 1992, ISBN 3-89319-376-6 , pp. 239-241.