Partitioning problem

The partitioning problem is a mathematical problem that has gained great importance in computer science due to its NP severity . It is concerned with finding an algorithm with which all possible representations as the sum of natural numbers (the number partitions of the initial number) can be determined for any natural number .

→ The number of representations of a natural number as the sum of natural numbers is defined as the value of the partition function of . Formulas for calculating the partition function and mathematical applications of the number partitions are described in the article Partition function. The present article describes the question as a problem of theoretical computer science and complexity theory and the application of number partitions to problems in computer science and technology. ${\ displaystyle N}$ ${\ displaystyle N, \, P (N)}$

Explanation

How many ways are there to write a natural number as the sum of natural numbers?

The answer to the number 4 is simple.

4 = 1 + 1 + 1 + 1
4 = 2 + 1 + 1
4 = 2 + 2
4 = 3 + 1
4 = 4

For the division of the number 6, starting with the division into all ones 1 + 1 + 1 + 1 + 1 + 1 = 6 through 2 + 2 + 2 = 6 to 6 = 6, there are exactly 11 possibilities. The Indian mathematician S. Ramanujan came up with a large number of variants very quickly when he calculated a list of the decomposition of all numbers up to 200. The number 200 can be broken down into exactly 3972999029388 addition variants.

Applications in computer science and technology

The meaning of the partitioning problem lies only indirectly in the partitioning of a number into smaller units. Rather, a problem can be described by reversing the partitioning.

Introductory example

When a group of children wants to team up to play a game, they usually vote. There is a pool of potential players on the sidelines, and the team captains of both teams take turns voting until each participant has been assigned to a team. The point of this ritual is that in the end teams of about the same strength compete against each other.

Formally, the problem could be specified as follows:

There are a lot of players out there.
The players have different skills. This could be generalized as a skill level on a scale from 1 to 10.
We are looking for a division into 2 groups in which the playing strengths of the individual players are the same in both groups.

The previous description could be mapped onto the partitioning problem by adding the skill levels of all players to the value . It is therefore clear that the group as well as the group must each have a skill level of so that the game is balanced. ${\ displaystyle X}$ ${\ displaystyle A}$ ${\ displaystyle B}$ ${\ displaystyle {\ frac {X} {2}}}$

It quickly becomes clear that there is a simple algorithm to approach a solution to the problem by alternately choosing the best remaining player. It can also be seen that there may not be a perfect solution to the problem.

This becomes unacceptable when giving change at a ticket machine. However, an approximate solution to the problem is often helpful in order to rule out highly unfavorable combinations.

Parallel processing in computers

An important area in which the partitioning problem comes into play is the distribution of computing tasks in multiprocessor systems. The problem is to determine the exact number of arithmetic tasks with the same runtime when the CPUs or cores are available . ${\ displaystyle n}$ ${\ displaystyle n}$

General: graph partitioning

In order to be able to optimally utilize the advantages of a parallel system in a computationally intensive computer program, it must meet two criteria:

The computing load must be evenly distributed across the computing units.
The communication between the processing units that is necessary for processing the program should be kept as small as possible, since this requires immense execution time.

Parallel search algorithms

A classic problem for artificial intelligence is the efficient processing of search trees. The tasks given thereby include a. the search for the shortest paths, the solution of the constraint satisfaction problem or the solution of discrete optimization considered here. The tree search method known from operations research " Branch-and-Bound " is often used for optimization . In Artificial Intelligence it is known as the A * algorithm .

Typically, the discrete optimization problems such as B. the problem of the traveling salesman, the backpack problem or the machine occupancy planning, characterized by high complexity and thus by long calculation times. Since the technical development of sequential computers is increasingly reaching its physical limits, the use of massive computing parallelism is a promising alternative. In the case of parallel computers, however, a balanced employment of the processors must be guaranteed to maintain efficiency. Especially for fine-grained parallel computers with several 10,000 processor elements, the load distribution of the computational work required on the processors is a key problem.

Partitioning in databases

In addition to the transparent operation of a logical database on several computer nodes, the data stock is also partitioned , especially in the area of the data warehouse . Logically related data in a table are sorted into groups and fragmented horizontally. This is advantageous for processing search queries, since the query usually shows directly which partial partitions must be used to process the query. Other parts of the logical table are then no longer searched. The concept can be extended by vertical fragmentation , in which the columns of a table are grouped and fragmented, as well as the distribution of the individual fragments to different computer nodes.

Suboptimal fragmentation can lead to a moderate gain in performance. Dividing the data sets with the highest query frequency into smaller partitions is ideal for efficient network planning. Partitioning into active and passive data can also be used for archiving purposes.

Example: Orders table

Partition 1: years 2000-2005
Partition 2: years 2005-2007
Partition 3: year 2008

Use in networks

The principle of load distribution is essential in many areas, for example for the operation of large databases, server applications or the operation and planning of network infrastructure.

DNS name resolution

Even medium-sized websites often cause too many inquiries to be answered by a computer system. It has been shown that the dynamic resolution of domain names to changing IP addresses is an effective method for redirecting user inquiries to different computer systems that are similar in their function.

The concept of the Round Robin DNS has proven to be an inexpensive solution. It combines low administrative effort with sufficient benefits. Thanks to the skillful use of IPs, it can continue to work without any problems even in the event of a failure, without having to intervene in the DNS system itself. The round robin is set for example via the name www.bildungsmarkt-sachsen.de and this is mapped to the IPs that are assigned the names bms1.hrz.tu-chemnitz.de and bms2.hrz.tu-chemnitz.de.

Planning of radio networks

In order to operate a wireless network, it is important to weigh the achieved granularity of network coverage with the costs involved.

Formally, the problem is to find a partitioning for an area so that the resulting regions have the highest granularity with the number of available sensors or the desired granularity with as few sensors as possible. ${\ displaystyle A}$

Various parameters must be taken into account for an optimal decision. These are added to a (variable) value using a linear weighting function, which is used for partitioning.

It is also possible that some regions of the network are more important than others and therefore need to be equipped with more sensors (so-called hot spots). If more sensors are installed than are required, there are fewer bottlenecks and a higher failure tolerance in the event of malfunctioning individual sensors. In addition, energy efficiency is becoming increasingly important. Through clever optimization, networks can switch off individual sensors in times of low utilization if their coverage area overlaps in such a way that it is also supplied by neighboring sensors.

Individual evidence

↑ Brian Hayes: The easiest hard problem ( Memento of 10 October 2008 at the Internet Archive ). American Scientist. April 2002

↑ Dominik Heinrich . 1995

^ Education market in Saxony ( Memento of the original from December 18, 2008 in the Internet Archive ) Info: The archive link was inserted automatically and has not yet been checked. Please check the original and archive link according to the instructions and then remove this notice. @1@ 2

[1] Brian Hayes: The easiest hard problem ( Memento of 10 October 2008 at the Internet Archive ). American Scientist. April 2002

[2] Dominik Heinrich . 1995