Search procedure

The computer science designated search method or search algorithm an algorithm , which in a search space looks for patterns or objects with certain properties. A distinction is made between simple and heuristic search algorithms. Simple search algorithms use intuitive methods for searching the search space, while heuristic search algorithms incorporate knowledge about the search space (e.g. data distribution) in order to reduce the search time required.

The solution of an algorithmic problem can generally be understood as a search for the solution in a set of possible solutions (the solution space ). The solution can be the target state, but also the path to the target or the sequence of corresponding actions. If the search space is finite, the search with a suitable search strategy can always lead to a result. In the case of infinite (solution) quantities , the search must be terminated according to certain criteria (e.g. after a certain time).

Repeated searches in a finite set can be made efficient by creating an index structure (e.g. in the form of a search tree ) over the data , which is sorted according to a specific criterion. Then you no longer have to look at all entries during a search (e.g. you start the search in a telephone book with the letter with which the name begins).

Simple search algorithms

Simple search algorithms neglect the special nature of the particular problem. They can therefore general and abstract implemented are making the same implementation can be used for a wide range of problems. The disadvantage of simple search algorithms is the resulting costs: The search space for search problems is generally very large, but simple searches only run within a reasonable time in small search spaces.

Search in lists

List search algorithms are the simplest search algorithms of all. The aim of searching in lists is to find a particular element of a list for which the associated search key is known. Since this problem is often encountered in computer science, the algorithms used - and their complexity - have been very well investigated.

Search in a skip list

The simplest search algorithm for lists is linear search . It runs through one element after the other until an element with the key you are looking for is found. The linear search has a running time of ( n is the number of elements in the list) and can be used on both sorted and unsorted lists. An advanced technique is binary search with a duration of . For large lists it is much more efficient than linear search, but it does require that the list has been sorted beforehand and that the elements can be accessed randomly . The interpolation search, also called interval search, is an improvement on the binary search, which requires an even distribution of the data. The runtime is only better than that of the binary search for very large amounts of data. Another search algorithm for lists is the Grover algorithm , which is used on quantum computers and is quadratically faster than the classic linear search for unsorted lists. Hashing can also be used to search in lists , which on average requires a constant time , but in the worst case , linear time. ${\ displaystyle {\ mathcal {O}} (n)}$ ${\ displaystyle {\ mathcal {O}} (\ log n)}$ ${\ displaystyle {\ mathcal {O}} (\ log (\ log n))}$ ${\ displaystyle {\ mathcal {O}} (1)}$

Search in trees

A binary search tree of size nine and depth three

The search in trees is the supreme discipline among the search algorithms. It searches nodes of trees regardless of whether the tree is explicit or implicit (generated during the search). The following principle is used: A node is taken from a data structure . Its child nodes are examined and, if necessary, added to the data structure. Depending on the selection of the data structure, the tree can be searched in different orders. Using a queue thus leads to a breadth-first search , traversing the tree level by level. When using a stack, on the other hand, the system searches up to one leaf and only then continues with the next child node. This is known as depth-first search .

Search in graphs

Many problems in graph theory can be solved with the help of search algorithms. Examples of these problems are the traveling salesman problem , the computation of shortest paths, and the construction of a minimal spanning tree . The corresponding algorithms are, for example, Kruskal's algorithm , Dijkstra's algorithm or Prim's algorithm , which can be seen as extensions of the algorithms for searching in trees.

Heuristic (informed) search algorithms

Strategies that can accelerate the process of finding solutions are called heuristics . Typical heuristics are rules of thumb, orientation on examples and the simulation of the human problem-solving process. Accordingly, the procedures can be differentiated into uninformed (also called blind search) and informed (use of heuristics). The study of different methods for heuristic search, the development and implementation of new methods and their application to different problem areas are usually included in the algorithmic core of artificial intelligence. These include B. automatic proofing, control of robots and, as typical representatives, especially games. Both two-person games (zero-sum games with complete information) such as B. chess , checkers , mills and one-person games such as sliding puzzles or solitaire. The classic methods for heuristic search are A * , IDA * , bidirectional search schemes , the minimax method , alpha-beta search .

Heuristic search algorithms are also used when an algorithm for problem solving is too computationally intensive. In this case, a certain error is accepted - that is, a suboptimal solution is also accepted - if the computing time used can be significantly reduced.

Optimizing search

This type of search solves optimization tasks in which a number of variables must be assigned values. Since there can be a large number of variables with a very large range of values, the search range is very large and conventional search methods fail.

Combinatorial search and backtracking are methods that are used in optimizing searches, especially for discrete variables.

For the analog search for minima or maxima of multidimensional functions, there are a number of numerical optimization methods that are used depending on the respective starting conditions.

Another procedure is based on the feedback from the user, who has to evaluate the relevance of the results, see relevance feedback .

Other search methods

Search methods for character strings search for the occurrence of a key in character strings. Well-known representatives are the algorithm by Knuth-Morris-Pratt , the algorithm by Boyer-Moore and the Karp-Rabin algorithm . See also: String Matching Algorithm .

Evolutionary algorithms use ideas from evolutionary theory as heuristics to get good results faster.

Simulated annealing ( simulated annealing ) is a value based on probability search algorithm.

Adversarial Search is used in the field of artificial intelligence .

Differences in performance of the search process

In the no-free lunch theorems it was shown that - averaged over all mathematically formulated problems - all search methods are equally good. A search method only shows a performance advantage on a specific class of problems.

Web links

R. Poli, WB Langdon, NF McPhee: A Field Guide to Genetic Programming. Lulu.com, 2008, ISBN 978-1-4092-0073-4 . (researchgate.net)