Reinforcement (psychology)

from Wikipedia, the free encyclopedia

Reinforcement is a term from behavioral biology and psychology, especially from behaviorism.

In conditioning, an event that increases the likelihood that a certain behavior will be exhibited is called reinforcement. A distinction is made between “positive” and “negative” reinforcement. Both cause a behavior to be shown more frequently. The difference is that in positive reinforcement (also called reward) a pleasant stimulus is presented following the desired behavior (e.g. recognition, attention, money, chocolate), whereas in negative reinforcement an unpleasant stimulus is removed (e.g. the removal of fear, noise, or an unpleasant activity). A negative reinforcer must not be confused with punishment, as often happens. Punishment refers to an event that lowers the probability of a behavior occurring. A distinction is made between type I punishment, the addition of an unpleasant, i.e. aversive, stimulus (e.g. beatings, insults, house arrest), and type II punishment, the removal of a pleasant stimulus or the withdrawal of a privilege (e.g. a TV ban, removal of a toy, ignoring), also called deprivation.

The technical term for these forms of learning, in which the organism learns through the reactions of the environment to its behavior, is instrumental or operant conditioning. The consequences of behavior thus have an effect on behavior. According to Skinner, the temporal sequence is the only decisive factor (“conditioning takes place presumably because of the temporal relation only”, p. 168). Behavior analysis defines reinforcement and reinforcers purely formally, via their effect on the rate of behavior. For theories as to why a reinforcer acts as such, see the article Reinforcer (psychology). Reinforcement can also merely be imagined, in what is known as covert reinforcement (see covert conditioning).

Positive reinforcement

One speaks of positive reinforcement when a behavior is followed by an event in the organism's environment and the probability of this behavior increases as a result. The event in the organism's environment is called a positive reinforcer. What counts as a positive reinforcer can only be recognized by the consequences it has for the likelihood of the behavior occurring. Positive reinforcers are therefore defined only formally, not in terms of content. Strictly speaking, one cannot say in advance whether a particular event is a positive reinforcer, a negative reinforcer, or irrelevant. Nevertheless, well-founded assumptions can be made: whether an event (e.g. feeding) is a positive reinforcer depends, among other things, on whether the organism is deprived of it, i.e. whether the event has not occurred for a long time. Reinforcers can be primary (species-specific and innate, e.g. food, appropriate temperature, opportunity for sexual activity) or secondary (conditioned or learned; in humans e.g. success, money, recognition). The colloquial equivalent of “positive reinforcer” is often “reward” or “pleasant consequence”. However, this contradicts the purely formal definition of “positive reinforcer” according to Skinner, since these terms contain assumptions about supposed mental states of the organism.

  • Example: A rat that has been kept without food for 24 hours sits in a cage with uniformly smooth walls. The only distinctive objects are a small movable lever and, near it, a dispensing chute for food. When the rat presses the lever, some grains of food automatically fall into the chute: the behavior (= accidental lever press) of the hungry rat has a positive consequence (for the rat) in the form of food delivery. In the medium term, the rat will stay near the chute more often than before, which also increases the probability that it will press the lever again. After two or three dozen lever presses, the observer has the impression that the rat is deliberately pressing the lever to get food. The behavior of pressing the lever has been reinforced, or to put it colloquially: the rat has "learned" how to get food. The reinforcer was the event of feeding.

This contingent reinforcement is also referred to as the three-term contingency, because what is learned is: if stimulus A is present, response B is followed by reinforcer C. The organism thus learns that when stimulus A is present, but not another stimulus, its response (its behavior) will most likely have a certain, pleasant, consequence on the part of the environment.
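As a rough illustration of the contingency described above, the following Python sketch tracks a response probability separately for stimulus A and for another stimulus. The update rule, the learning rate of 0.2, the starting value of 0.5, and all names are assumptions made purely for illustration; they are not taken from the article or from Skinner's work, and the sketch says nothing about why a reinforcer works.

```python
# Toy model (an assumption, not from the article): each stimulus has its own
# response probability, nudged toward 1 after a reinforced lever press and
# toward 0 after an unreinforced one.


def update(p: float, reinforced: bool, rate: float = 0.2) -> float:
    """Move probability p a fraction `rate` of the way toward 1 or 0."""
    target = 1.0 if reinforced else 0.0
    return p + rate * (target - p)


def train(presses: int, start: float = 0.5) -> dict:
    """Return response probabilities after `presses` lever presses per stimulus.

    Pressing the lever is followed by the reinforcer only while
    stimulus "A" is present, never under the other stimulus.
    """
    prob = {"A": start, "other": start}
    for _ in range(presses):
        prob["A"] = update(prob["A"], reinforced=True)
        prob["other"] = update(prob["other"], reinforced=False)
    return prob


result = train(presses=30)
print(result)
```

Under these assumptions, the probability of responding rises toward 1 in the presence of stimulus A and falls toward 0 otherwise, which mirrors the discrimination the paragraph describes.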

Negative reinforcement

Negative reinforcement occurs when an unpleasant stimulus is removed following a behavior. Like positive reinforcement, negative reinforcement increases the probability of the behavior occurring. Reinforcement can also consist in avoiding an event (e.g. a fear-inducing one) in the organism's environment, with the rate of behavior increasing as a result.

Attention: Negative reinforcement must not be confused with punishment, which reduces the frequency of behavior. Negative reinforcement is not called "negative" because something "negative" (e.g. an electric shock or the presence of an anxiety-provoking object) was terminated. Rather, the term derives from the inverse application of the reinforcement procedure (something is taken away).

  • Example: A rat is sitting in a cage whose metal floor is electrified. The rat shows various behaviors; among other things, it presses the lever. As a consequence of the “press lever” behavior, the current is switched off. If the floor is electrified again in later rounds, the rat presses the lever sooner than before (and thus ends the electric shock). Finally, the rat presses the lever before the current flows, thus avoiding the aversive stimulus (the electric shock).

From a behavioral perspective, the maintenance of phobias can also be viewed as a case of negative reinforcement. A person with a dog phobia, for example, crosses to the other side of the street when a dog comes towards them. By crossing the street, the person ends or avoids the fear-inducing contact with the dog. The phobic behavior of "crossing the street" is, however, reinforced by this, which here means that it is maintained.


Avoidance

In avoidance, a form of negative reinforcement, a response prevents an aversive stimulus from occurring at all. For example, by paying a fine one can avoid the unpleasant consequences of a late payment. There are two influential theories that explain avoidance behavior: the two-factor theory and the one-factor theory.

Two-factor theory

Orval Hobart Mowrer's two-factor theory postulates, as the name suggests, two components of an avoidance response:

  1. Classical conditioning: learning to fear a previously neutral stimulus.
  2. Operant conditioning: reacting to escape the stimulus.

Some experimental findings support the two-factor theory, while others speak against it: individuals continue to perform avoidance responses even though they show no signs of fear, and avoidance responses are generally resistant to extinction.

One-factor theory

The one-factor theory was proposed as an alternative to the two-factor theory. In contrast to the latter, it assumes that avoidance behavior does not require the removal of a learned fear-inducing stimulus; avoidance of the aversive event itself is the reinforcer. Experiments have shown that animals can learn avoidance responses without any stimulus announcing the impending aversive stimulus.


Punishment

Direct punishment: Type I punishment (also known as "positive" punishment) occurs when the operant behavior produces an event that leads to a decrease in the rate of that behavior in this situation. For example, the electric shock that a grazing animal receives when it touches the wire of an electric fence can be called punishment in the behavioral sense, provided that the animal's behavior of touching the fence becomes rarer in the future; one speaks of "punishment" when the rate of a behavior falls because of a consequence of that behavior. Another example of punishment is a loud "Ugh!" when a dog does something forbidden (if the "Ugh!" is a conditioned punisher for the dog), or a firm jerk on the leash.

Indirect punishment: Type II punishment (also “negative” punishment) occurs when an ongoing event is ended as a result of operant behavior and the rate of behavior decreases. In the “Skinner box”, the rat no longer receives food at the push of a lever, as it did before. The rat will slowly stop pushing the lever. Or parents forbid their child to watch television (provided that watching television is pleasant for the child) if the child has not followed certain family rules.

Extinction must be distinguished from punishment: a reinforcer that previously followed a behavior is no longer given, and the rate of behavior then drops.

Punishment is the opposite of reinforcement: while reinforcement causes behavior to increase, punishment causes behavior to decrease. When a certain behavior is punished, the incidence of that behavior decreases, while other non-punished behaviors remain essentially unchanged.

Punitive stimuli

Similar to reinforcers, punitive stimuli can be divided into primary and conditioned stimuli.

  • Primary punitive stimuli are stimuli which usually have a direct effect on the organism and cause physical damage (e.g. blows).
  • Secondary punitive stimuli become punishing only through the individual's learning history (e.g. admonitions).
  • Generalized punitive stimuli are stimuli that are paired with a large number of other punitive stimuli (e.g. social exclusion).

Factors that affect the effectiveness of punishment

As part of the experimental analysis of behavior, the effects of punishment on behavior have been extensively researched. The effectiveness of punishment depends on several factors:


Intensity

If a permanent reduction in behavior is to be achieved, punishment should be applied from the outset at full intensity. Individuals can get used to a mild punishment (habituation) and can generalize this habituation to higher punishment intensities. So if one starts with a mild punishment and gradually increases it, the effects on behavior are minor. If the punishment intensity is sufficiently high, however, the behavior will cease completely. In one experiment, for example, it was observed that an electric shock of 80 volts was sufficient to permanently suppress a behavior in pigeons. If, on the other hand, the experiment started with lower voltages, the pigeons got used to the punishment and carried out the behavior even when given electric shocks of 130 volts.


Immediacy

Similar to a reinforcer, a punitive stimulus is most effective when it immediately follows the behavior to be punished. The more time passes between behavior and punishment, the less effective the punishment. This has been demonstrated experimentally in rats, for example. Another study found that in school teaching, it is more effective for a teacher to reprimand bad behavior immediately rather than letting time pass.

Punishment schedule

Analogous to the different reinforcement schedules, different punishment schedules also have different effects on behavior. In experiments with different punishment schedules, it has been found that punishing every occurrence of the behavior is most effective. In principle, punishment schedules produce the opposite of reinforcement schedules: for example, where a certain reinforcement schedule leads to an accelerating response pattern, the corresponding punishment schedule leads to a decelerating response pattern.

Behavioral motivation

The effectiveness of punishment is inversely related to the motivation for the behavior. If a reinforcer is "unattractive", the motivation to accept punishment in order to obtain this reinforcer is low. For example, hungry pigeons respond little to punishment if their behavior allows them to get food. If, on the other hand, the pigeons are sated, they quickly stop the behavior when punished, since food is then not an effective reinforcer.

Available behavior alternatives

Punishment of a behavior is effective when an alternative behavior is available for obtaining the reinforcer that sustained the undesirable response. Therefore, behavior modification usually punishes one behavior while reinforcing an alternative behavior.

Discriminative cue

Punishment can also act as a discriminative cue, i.e. as a signal that announces the availability of other (pleasant or unpleasant) stimuli. If a punitive stimulus is contingently followed by reinforcement, it becomes a signal for that reinforcement, and response rates rise instead of falling after the punishment is given. This may explain forms of masochism and self-harming behavior in which the painful stimulus itself is followed by reinforcement (e.g. attention).

Disadvantages of punishment

Although punishment in experiments can be an effective tool for influencing behavior, behavior analysts have identified a number of disadvantages, which in particular call into question the effectiveness of punishment in human society.

Punishment can have emotional consequences such as fear and anger, which in turn can impair performance. In one experiment, for example, students worked worse and more slowly when every mistake was punished with an electric shock; when errors were merely signaled by a tone, performance was significantly better.

An even more problematic side effect of punishment is aggression. After a punishment, aggression can be directed against the punishing person or against other organisms or objects. For example, in one experiment, rats that had previously lived together peacefully began to fight each other after being given electric shocks. Similar observations have been made in pigeons, mice, cats and monkeys.

Furthermore, punishment can lead to a general decrease in behavior, rather than just a decrease in the behavior being punished. For example, a severe reprimand from a teacher in response to a student's incorrect answer can lead to the student no longer answering at all, even when the student knows the correct answer.

Another disadvantage is that punishment requires constant monitoring of behavior, while reinforcement does not. This is because, on the one hand, punishment is most effective when it follows the behavior consistently, and on the other hand, it is not in an individual's interest to draw attention to behavior that will be punished. For example, a child is less likely to tell the parents that the homework has not been done if punishment is to be feared. Conversely, the child will show the homework to the parents if doing so is reinforced.

It is reinforcing for individuals to avoid punishment (negative reinforcement). As a result, they may try to circumvent the rules or avoid the situation altogether. For example, a student evades detention by skipping school completely. It is even more problematic when the punishing person becomes a conditioned punitive stimulus. Under certain circumstances, the punishment is then associated with the person who punishes rather than with the behavior being punished.

In addition to these empirically established limitations, there are also ethical concerns about the use of punishment, especially against people. If positive reinforcement can produce results that are as good or even better, why should one resort to punishment? For these reasons, punishment is used as a means of controlling behavior in behavior analysis only in exceptional cases and when positive reinforcement is not an alternative.

The contingency scheme

Holland and Skinner illustrate the terms mentioned in the so-called contingency scheme:

                      Presentation             Removal
positive reinforcer   positive reinforcement   punishment (type II)
negative reinforcer   punishment (type I)      negative reinforcement

Colloquially, these terms could be paraphrased as follows:

  • Positive reinforcement means: you do something more often because you get something pleasant in return (e.g. a student answers and is praised; he will answer more often in the future).
  • Negative reinforcement means: you do something more often because it ends or avoids something unpleasant (e.g. a student does his homework completely and a previous TV ban is lifted; he does his homework more often in the future).
  • Punishment (type I, also "direct punishment") means: you do something less often or not at all because something unpleasant would then happen to you and has already happened once (example: a child lies, is scolded for it and lies less often in the future; or: a child touches a hot stove top and burns their fingers, the child will no longer touch the hot stove top in the future).
  • Punishment by loss (type II, also "indirect punishment") means: you do something less often because you would otherwise lose something pleasant (e.g. a child lies, has pocket money withdrawn, and consequently lies less).
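The two-by-two contingency scheme (kind of stimulus, and whether it is presented or removed) can be sketched as a simple lookup table in Python. The class and function names below are hypothetical and chosen only for this illustration. Note also that the sketch takes the kind of stimulus as given, whereas under the formal definition a stimulus can only be identified as a reinforcer by its observed effect on the rate of behavior.

```python
from enum import Enum


class Stimulus(Enum):
    POSITIVE_REINFORCER = "positive reinforcer"
    NEGATIVE_REINFORCER = "negative reinforcer"


class Operation(Enum):
    PRESENTATION = "presentation"  # the stimulus follows the behavior
    REMOVAL = "removal"            # the stimulus is ended or withdrawn


# The four cells of the contingency scheme.
SCHEME = {
    (Stimulus.POSITIVE_REINFORCER, Operation.PRESENTATION): "positive reinforcement",
    (Stimulus.POSITIVE_REINFORCER, Operation.REMOVAL): "punishment (type II)",
    (Stimulus.NEGATIVE_REINFORCER, Operation.PRESENTATION): "punishment (type I)",
    (Stimulus.NEGATIVE_REINFORCER, Operation.REMOVAL): "negative reinforcement",
}


def classify(stimulus: Stimulus, operation: Operation) -> str:
    """Name the procedure for a given kind of stimulus and operation."""
    return SCHEME[(stimulus, operation)]


def changes_rate(procedure: str) -> str:
    """Reinforcement raises the rate of behavior; punishment lowers it."""
    return "increases" if "reinforcement" in procedure else "decreases"


# Praise after an answer: a pleasant stimulus is presented.
print(classify(Stimulus.POSITIVE_REINFORCER, Operation.PRESENTATION))
# Pocket money is withdrawn after a lie: a pleasant stimulus is removed.
print(classify(Stimulus.POSITIVE_REINFORCER, Operation.REMOVAL))
```

The lookup makes the symmetry of the scheme visible: both kinds of reinforcement sit on one diagonal and both types of punishment on the other.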

Behavioral science and lay psychological terminology

The colloquial descriptions mentioned are only used for clarification and necessarily simplify things. They do not replace the correct definitions (see above) and cannot be used synonymously with them.

The (colloquial) “reward” does not always lead to an increase in the rate of behavior, so not every reward intended as such is a reinforcer. In addition, a person is rewarded, whereas only a behavior can be reinforced. The same applies to (colloquial) punishment: not every punishment intended as such reduces the rate of behavior. Moreover, (colloquial) rewarding and punishing are always active actions by one person toward another: the mother rewards the child with a bar of chocolate, the teacher punishes the student with detention. Reinforcement, by contrast, also takes place in nature, without human intervention. The driver's turning of the ignition key is positively reinforced by the engine starting; nobody has to sit next to the driver and offer praise for it. That this is a case of positive reinforcement becomes apparent when the usual reinforcer “engine starts” fails to occur: the driver will eventually stop showing the behavior “turn the ignition key”, and the behavior is extinguished, though not before the usual extinction burst has been shown (i.e. the driver tries again for a while before giving up trying to start the car).

References

  1. B. F. Skinner: Superstition in the pigeon. In: Journal of Experimental Psychology. Princeton NJ, vol. 38, 1948, pp. 168-172, ISSN 0022-1015.
  2. Michael Linden, Martin Hautzinger: Behavioral therapy: techniques and individual procedures. Springer-Verlag, 2013, ISBN 978-3-662-22591-2, p. 325.
  3. James E. Mazur: Learning and behavior. 6th edition. Pearson Studium, Hallbergmoos, ISBN 3-8273-7218-6, p. 278.
  4. R. J. Herrnstein: Method and theory in the study of avoidance. In: Psychological Review. Vol. 76, no. 1, January 1, 1969, ISSN 0033-295X, pp. 49-69, PMID 5353378.
  5. R. L. Solomon, L. J. Kamin, L. C. Wynne: Traumatic avoidance learning: the outcomes of several extinction procedures with dogs. In: Journal of Abnormal Psychology. Vol. 48, no. 2, April 1, 1953, ISSN 0021-843X, pp. 291-302, PMID 13052353.
  6. N. E. Miller: Studies of fear as an acquirable drive: fear as motivation and fear-reduction as reinforcement in the learning of new responses. In: Journal of Experimental Psychology. Vol. 38, no. 1, February 1, 1948, ISSN 0022-1015, pp. 89-101, PMID 18910262.
  7. R. G. Weisman, J. S. Litner: Positive conditioned reinforcement of Sidman avoidance behavior in rats. In: Journal of Comparative and Physiological Psychology. Vol. 68, no. 4, August 1, 1969, ISSN 0021-9940, pp. 597-603, doi: 10.1037/h0027682.
  8. P. J. Bersh, J. M. Notterman, W. N. Schoenfeld: Extinction of a human cardiac-response during avoidance-conditioning. In: The American Journal of Psychology. Vol. 69, no. 2, June 1, 1956, ISSN 0002-9556, pp. 244-251, PMID 13327085.
  9. R. L. Solomon, L. C. Wynne: Traumatic avoidance learning: the principles of anxiety conservation and partial irreversibility. In: Psychological Review. Vol. 61, no. 6, November 1, 1954, ISSN 0033-295X, pp. 353-385, PMID 13215688.
  10. R. J. Herrnstein, P. N. Hineline: Negative reinforcement as shock-frequency reduction. In: Journal of the Experimental Analysis of Behavior. Vol. 9, no. 4, July 1, 1966, ISSN 0022-5002, pp. 421-430, PMID 5961510, PMC 1338243 (free full text).
  11. Christoph Bördlein: Introduction to behavior analysis. 1st edition. Alibri, Aschaffenburg 2015, ISBN 978-3-86569-232-0, p. 145.
  12. Nathan H. Azrin, William C. Holz: Punishment. In: Werner K. Honig (Ed.): Operant Behavior: Areas of Research and Application. Appleton-Century-Crofts, New York 1966, pp. 380-447.
  13. N. H. Azrin, W. C. Holz, D. F. Hake: Fixed-ratio punishment. In: Journal of the Experimental Analysis of Behavior. Vol. 6, no. 2, April 1, 1963, pp. 141-148, doi: 10.1901/jeab.1963.6-141, PMID 13965779, PMC 1404287 (free full text).
  14. Alan Baron, Arnold Kaufman, Dan Fazzini: Density and delay of punishment of free-operant avoidance. In: Journal of the Experimental Analysis of Behavior. Vol. 12, no. 6, November 1, 1969, ISSN 0022-5002, pp. 1029-1037, doi: 10.1901/jeab.1969.12-1029, PMID 16811408, PMC 1338715 (free full text).
  15. Ann J. Abramowitz, Susan G. O'Leary: Effectiveness of delayed punishment in an applied setting. In: Behavior Therapy. Vol. 21, no. 2, January 1, 1990, pp. 231-239, doi: 10.1016/S0005-7894(05)80279-5.
  16. Ennio Cipani, Janet Brendlinger, Linda McDowell, Stacey Usher: Continuous vs. intermittent punishment: A case study. In: Journal of Developmental and Physical Disabilities. Vol. 3, no. 2, ISSN 1056-263X, pp. 147-156, doi: 10.1007/BF01045930.
  17. N. H. Azrin, W. C. Holz, D. F. Hake, T. Ayllon: Fixed-ratio escape reinforcement. In: Journal of the Experimental Analysis of Behavior. Vol. 6, no. 3, July 1, 1963, ISSN 0022-5002, pp. 449-456, doi: 10.1901/jeab.1963.6-449, PMID 13965780, PMC 1404469 (free full text).
  18. Marie T. Balaban, Dell L. Rhodes, Allen Neuringer: Orienting and defense responses to punishment: Effects on learning. In: Biological Psychology. Vol. 30, no. 3, June 1, 1990, pp. 203-217, doi: 10.1016/0301-0511(90)90140-R.
  19. R. E. Ulrich, N. H. Azrin: Reflexive fighting in response to aversive stimulation. In: Journal of the Experimental Analysis of Behavior. Vol. 5, no. 4, October 1, 1962, ISSN 0022-5002, pp. 511-520, doi: 10.1901/jeab.1962.5-511, PMID 13995319, PMC 1404196 (free full text).
  20. Dorothea C. Lerman, Christina M. Vorndran: On the status of knowledge for using punishment: implications for treating behavior disorders. In: Journal of Applied Behavior Analysis. Vol. 35, no. 4, January 1, 2002, ISSN 0021-8855, pp. 431-464, doi: 10.1901/jaba.2002.35-431, PMID 12555918, PMC 1284409 (free full text).
  21. J. G. Holland, B. F. Skinner: Analysis of the behavior. Urban & Schwarzenberg, Munich 1974, p. 218.
  22. A. Charles Catania: Learning. Interim (4th) Edition. Sloan Publishing, Cornwall-on-Hudson, NY 2007, ISBN 1-59738-007-5.
  23. Paul Chance: Learning and Behavior. Brooks/Cole Publishing Company, Pacific Grove 1999, ISBN 0-534-34691-X.
  24. "People are rewarded, but behavior is reinforced," B. F. Skinner: What is wrong with daily life in the western world? In: American Psychologist. Washington DC, vol. 41, 1986, no. 5, pp. 568-574, ISSN 0003-066X (p. 569).