Version space

from Wikipedia, the free encyclopedia

In machine learning, the version space is that subset of the hypothesis space that contains all consistent and complete hypotheses with respect to a set of learning examples. A hypothesis is called consistent if it does not classify any negative training examples as positive. A hypothesis is called complete if all positive examples of a hypothesis are correctly classified.

The version space learning method (Mitchell 1982) is an incremental machine learning method for learning a concept . In the event that the training examples are not noisy and the desired target concept is contained in the hypothesis space, the version space learning method provides a compact representation of the version space.

Generality in the hypothesis space

The algorithm is based on a partial order that allows hypotheses to be differentiated according to generality. A hypothesis is called more special than if the following applies to all x from the set of possible target concepts:

Version space learning procedure

The version space learning method is a machine method in the field of AI to teach the computer to correctly assess previously unknown information.

algorithm

Initially, the version space contains all possible hypotheses, i.e. it matches the hypothesis space. By sequentially adding positive and negative training examples, it is restricted further and further until, ideally, it only consists of one element. The version space is represented by two sets named S and G ("specific" and "general"). S is the set of the most special hypotheses and contains all hypotheses that are consistent with the training examples, i.e. classify them correctly. Furthermore, none of the hypotheses in S must be more general than another hypothesis in the version space. Similarly, G contains the most general hypotheses that are consistent with the training data.

Initially, S contains the most specific hypothesis, i.e. the hypothesis that classifies each target concept negatively, and G the most general hypothesis, i.e. the hypothesis that classifies each target concept positively. Then the set of all training examples is iterated over and S and G are adjusted so that the above requirements for S and G are met.

advantages and disadvantages

The first advantage of the version space learning method is the implicit representation of the version space. Old examples do not have to be saved, which means that there is less memory overhead to display the version space. Another advantage is the ability to independently recognize a sufficiently large number of training examples (stop if S = G). The speed of learning is increased if hypotheses can be generated and added to S or G, for example created by experts. In this case, the algorithm can select examples that separate the version space into parts of the same size as possible. Learning such an example ensures a quick reduction in the version space size.

example

The example demonstrates how a specific version space is created using examples.

Before the examples are classified in the version space, the quantities and are initially assigned .

Start allocation

Positive example

  • (Football, team, outside, national, Saturday)
  • Football, team, outside, national, Saturday
  • ?,?,?,?,?

Explanation

does not contain the example . generalized around . still allows all examples.

Positive example

  • (Hockey, team, outside, national, Saturday)
  • ?, Team, outside, national, Saturday
  • ?,?,?,?,?

Explanation

does not include the new example . Therefore it is generalized to include. As and differ only in the sport, replacing football through the wildcard symbol ?

Negative example

  • (Floor exercise, single, indoor, world, Saturday)
  • ?, Team, outside, national, Saturday
  • (?, Team,?,?,?), (?,?, Outside,?,?), (?,?,?, National,?)

Explanation

does not contain the negative example, therefore remains unchanged. must be specialized by listing all cases that prevent it from being recognized as a valid example. At the same time, it must be general enough that the previous examples allow it.

Positive example

  • (Handball, team, indoor, national, Saturday)
  • ?, Team,?, National, Saturday
  • (?, Team,?,?,?), (?,?,?, National,?)

Explanation

does not contain the current example and must therefore be expanded. would reject the current example, so you have to specialize.

Negative example

  • (Decathlon, singles, outside, world, Sunday)
  • ?, Team,?, National, Saturday
  • (?, Team,?,?,?), (?,?,?, National,?)

Explanation

As the example rejects is . Also the example does not allow, that is .

literature

Individual evidence

  1. Example of a concept learning task. (No longer available online.) Archived from the original on March 4, 2016 ; accessed on June 16, 2016 . Info: The archive link was inserted automatically and has not yet been checked. Please check the original and archive link according to the instructions and then remove this notice. @1@ 2Template: Webachiv / IABot / www2.inf.fh-rhein-sieg.de