Boltzmann machine

A graphic representation of a Boltzmann machine with 3 hidden units and 4 visible units.

A Boltzmann machine is a stochastic artificial neural network developed by Geoffrey Hinton and Terrence J. Sejnowski in 1985. These networks are named after the Boltzmann distribution. Boltzmann machines with unrestricted connections are very difficult to train. However, if the connections between the neurons are restricted, learning becomes much simpler, so that restricted Boltzmann machines can be used to solve practical problems.

Structure

Like a Hopfield network, a Boltzmann machine is a network of neurons for which an energy is defined. As in Hopfield networks, the neurons take only binary values (0 or 1), but in contrast they behave stochastically. The energy $E$ of a Boltzmann machine is defined as in a Hopfield network:

$$E = -\left(\sum_{i<j} w_{ij}\, s_i\, s_j + \sum_i \theta_i\, s_i\right)$$

where:

$w_{ij}$ is the weight of the connection between neurons $i$ and $j$.
$s_i$ is the state of neuron $i$, with $s_i \in \{0, 1\}$.
$\theta_i$ is the threshold of neuron $i$. ($-\theta_i$ is the value from which the neuron is activated.)

The connections of a Boltzmann machine are subject to two restrictions:

$w_{ii} = 0 \quad \forall i$. (No neuron has a connection with itself.)
$w_{ij} = w_{ji} \quad \forall i, j$. (All connections are symmetric.)

The weights can be represented as a symmetric matrix $W$ whose main diagonal consists of zeros.
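The energy formula and the two restrictions on the connections can be checked numerically. The following is a minimal sketch in plain Python; the function name `energy` and the three-neuron weights and thresholds are made up for illustration and do not come from the original paper.

```python
def energy(W, theta, s):
    """Return E = -(sum_{i<j} w_ij s_i s_j + sum_i theta_i s_i)."""
    n = len(s)
    # Check the two restrictions on the connections.
    for i in range(n):
        assert W[i][i] == 0, "no neuron may be connected to itself"
        for j in range(i + 1, n):
            assert W[i][j] == W[j][i], "connections must be symmetric"
    # Each pair (i, j) with i < j is counted exactly once.
    pair_term = sum(W[i][j] * s[i] * s[j]
                    for i in range(n) for j in range(i + 1, n))
    threshold_term = sum(theta[i] * s[i] for i in range(n))
    return -(pair_term + threshold_term)


# Hypothetical example values: symmetric matrix with a zero diagonal.
W = [[0.0, 1.0, -2.0],
     [1.0, 0.0, 3.0],
     [-2.0, 3.0, 0.0]]
theta = [1.0, -1.0, 2.0]

print(energy(W, theta, [1, 1, 0]))  # -1.0
print(energy(W, theta, [1, 1, 1]))  # -4.0
```

In this toy example the state (1, 1, 1) has lower energy than (1, 1, 0), so the network would tend to prefer it.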

As with the Hopfield network, successive updates of a Boltzmann machine tend to reduce the energy defined in this way, ultimately minimizing it until a stable state is reached.
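The text does not spell out the stochastic update itself. A common formulation, given here as a sketch under assumptions, switches neuron $i$ on with probability $1 / (1 + e^{-\Delta E_i / T})$, where $\Delta E_i = \sum_j w_{ij} s_j + \theta_i$ is the energy difference between the neuron being off and on. The temperature parameter `T`, the helper names and the example values are assumptions introduced for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def energy_gap(W, theta, s, i):
    """E(s_i = 0) - E(s_i = 1) for the energy function of the article."""
    return sum(W[i][j] * s[j] for j in range(len(s)) if j != i) + theta[i]


def update_unit(W, theta, s, i, T=1.0, rng=random):
    """Resample neuron i in place and return the probability used."""
    p_on = sigmoid(energy_gap(W, theta, s, i) / T)
    s[i] = 1 if rng.random() < p_on else 0
    return p_on


def sweep(W, theta, s, T=1.0, rng=random):
    """Update every neuron once, in index order."""
    for i in range(len(s)):
        update_unit(W, theta, s, i, T, rng)
    return s


# Hypothetical three-neuron network (symmetric weights, zero diagonal).
W = [[0.0, 1.0, -2.0],
     [1.0, 0.0, 3.0],
     [-2.0, 3.0, 0.0]]
theta = [1.0, -1.0, 2.0]

rng = random.Random(0)  # fixed seed so that runs are repeatable
state = [0, 0, 0]
for _ in range(10):
    sweep(W, theta, state, T=1.0, rng=rng)
print(state)
```

With this rule, states of lower energy are visited more often but are not guaranteed. As `T` approaches zero the update becomes nearly deterministic, which resembles the behaviour of a Hopfield network.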

Restricted Boltzmann machine

A restricted Boltzmann machine (RBM) consists of visible units and hidden units. The feature vector is applied to the visible units.

The term "restricted" comes from the fact that the visible units are not connected to each other and the hidden units are not connected to each other. However, the visible units are fully connected to the hidden units. They therefore form a bipartite, undirected graph. This is illustrated below:

The parameters to be learned are the weights of the edges between visible and hidden units and the bias vectors $b_h, b_v$ of the hidden and visible units. These are learned using the contrastive divergence algorithm.
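One step of contrastive divergence (often called CD-1) can be sketched as follows. This is a simplified illustration in plain Python and not a reference implementation: the function names, the learning rate `lr`, the toy data and the choice to use hidden probabilities rather than sampled states in the update are assumptions. The names `b_v` and `b_h` correspond to the bias vectors above, and `W[i][j]` is assumed to connect visible unit `i` to hidden unit `j`.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hidden_probs(v, W, b_h):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b_h))]


def visible_probs(h, W, b_v):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b_v))]


def sample(probs, rng):
    """Draw a binary state for each unit from its probability."""
    return [1 if rng.random() < p else 0 for p in probs]


def cd1_step(v0, W, b_v, b_h, lr=0.1, rng=random):
    """Apply one CD-1 update for a single training vector v0 (in place)."""
    ph0 = hidden_probs(v0, W, b_h)              # positive phase (data)
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, b_v), rng)  # reconstruction
    ph1 = hidden_probs(v1, W, b_h)              # negative phase
    for i in range(len(b_v)):
        for j in range(len(b_h)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b_v[i] += lr * (v0[i] - v1[i])
    for j in range(len(b_h)):
        b_h[j] += lr * (ph0[j] - ph1[j])


# Toy setup: 4 visible and 3 hidden units, two made-up training patterns.
rng = random.Random(0)
n_visible, n_hidden = 4, 3
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
b_v = [0.0] * n_visible
b_h = [0.0] * n_hidden
data = [[1, 1, 0, 0], [0, 0, 1, 1]]

for epoch in range(100):
    for v in data:
        cd1_step(v, W, b_v, b_h, lr=0.1, rng=rng)

print(hidden_probs(data[0], W, b_h))
```

Because there are no connections within a layer, all hidden units can be sampled independently given the visible units and vice versa, which is what keeps each step cheap.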

Restricted Boltzmann machines have been used for collaborative filtering on Netflix.

References

[1] Ackley, David H.; Hinton, Geoffrey E.; Sejnowski, Terrence J. (1985). "A Learning Algorithm for Boltzmann Machines". Cognitive Science 9 (1): 147-169. doi:10.1207/s15516709cog0901_7.
[2] Hinton, Geoffrey (2010). "A practical guide to training restricted Boltzmann machines".
[3] Salakhutdinov, Ruslan; Mnih, Andriy; Hinton, Geoffrey (2007). "Restricted Boltzmann machines for collaborative filtering". In: Proceedings of the 24th International Conference on Machine Learning, pp. 791-798.
