Reinforcement Theories: Positive, Negative, and Intermittent, Study notes of Psychology

These notes cover the concept of reinforcement, its types (including positive and negative reinforcement), and the impact of intermittent reinforcement on behavior. Skinner's views on reinforcement and its comparison to punishment are also discussed. The document further covers the effects of reinforcement on memory and its role in learning theory.

Typology: Study notes

2021/2022

Uploaded on 09/12/2022

ehimay


Reinforcement

This article is about the psychological concept. For the construction materials reinforcement, see Rebar. For reinforcement learning in computer science, see Reinforcement learning. For beam stiffening, see Stiffening.

In behavioral psychology, reinforcement is a consequence that will strengthen an organism’s future behavior whenever that behavior is preceded by a specific antecedent stimulus. This strengthening effect may be measured as a higher frequency of behavior (e.g., pulling a lever more frequently), longer duration (e.g., pulling a lever for longer periods of time), greater magnitude (e.g., pulling a lever with greater force), or shorter latency (e.g., pulling a lever more quickly following the antecedent stimulus).

[Figure: Diagram of operant conditioning]

Although in many cases a reinforcing stimulus is a rewarding stimulus which is “valued” or “liked” by the individual (e.g., money received from a slot machine, the taste of the treat, the euphoria produced by an addictive drug), this is not a requirement. Indeed, reinforcement does not even require an individual to consciously perceive an effect elicited by the stimulus.[1] Furthermore, stimuli that are “rewarding” or “liked” are not always reinforcing: if an individual eats at a fast food restaurant (response) and likes the taste of the food (stimulus), but believes it is bad for their health, they may not eat it again and thus it was not reinforcing in that condition. Thus, reinforcement occurs only if there is an observable strengthening in behavior.

In most cases reinforcement refers to an enhancement of behavior, but this term may also refer to an enhancement of memory. One example of this effect is called post-training reinforcement, where a stimulus (e.g. food) given shortly after a training session enhances the learning.[2] This stimulus can also be an emotional one. A good example is that many people can explain in detail where they were when they found out the World Trade Center was attacked.[3][4]

Reinforcement is an important part of operant or instrumental conditioning.

1 Introduction

B.F. Skinner was a well-known and influential researcher who articulated many of the theoretical constructs of reinforcement and behaviorism. Skinner defined reinforcers according to the change in response strength (response rate) rather than to more subjective criteria, such as what is pleasurable or valuable to someone. Accordingly, activities, foods or items considered pleasant or enjoyable may not necessarily be reinforcing (because they produce no increase in the response preceding them). Stimuli, settings, and activities only fit the definition of reinforcers if the behavior that immediately precedes the potential reinforcer increases in similar situations in the future. Consider, for example, a child who receives a cookie when he or she asks for one. If the frequency of “cookie-requesting behavior” increases, the cookie can be seen as reinforcing “cookie-requesting behavior”. If, however, “cookie-requesting behavior” does not increase, the cookie cannot be considered reinforcing.

The sole criterion that determines if a stimulus is reinforcing is the change in probability of a behavior after administration of that potential reinforcer. Other theories may focus on additional factors such as whether the person expected a behavior to produce a given outcome, but in the behavioral theory, reinforcement is defined by an increased probability of a response.

The study of reinforcement has produced an enormous body of reproducible experimental results. Reinforcement is the central concept and procedure in special education, applied behavior analysis, and the experimental analysis of behavior, and is a core concept in some medical and psychopharmacology models, particularly addiction, dependence, and compulsion.


2 Brief history

Laboratory research on reinforcement is usually dated from the work of Edward Thorndike, known for his experiments with cats escaping from puzzle boxes.[8] A number of others continued this research, notably B.F. Skinner, who published his seminal work on the topic in The Behavior of Organisms, in 1938, and elaborated this research in many subsequent publications.[9] Notably, Skinner argued that positive reinforcement is superior to punishment in shaping behavior.[10] Though punishment may seem just the opposite of reinforcement, Skinner claimed that they differ immensely, saying that positive reinforcement results in lasting behavioral modification (long-term) whereas punishment changes behavior only temporarily (short-term) and has many detrimental side effects.

A great many researchers subsequently expanded our understanding of reinforcement and challenged some of Skinner’s conclusions. For example, Azrin and Holz defined punishment as a “consequence of behavior that reduces the future probability of that behavior,”[11] and some studies have shown that positive reinforcement and punishment are equally effective in modifying behavior. Research on the effects of positive reinforcement, negative reinforcement and punishment continues today, as those concepts are fundamental to learning theory and apply to many practical applications of that theory.

3 Operant conditioning

Main article: Operant conditioning

The term operant conditioning was introduced by B. F. Skinner to indicate that in his experimental paradigm the organism is free to operate on the environment. In this paradigm the experimenter cannot trigger the desirable response; the experimenter waits for the response to occur (to be emitted by the organism) and then a potential reinforcer is delivered. In the classical conditioning paradigm the experimenter triggers (elicits) the desirable response by presenting a reflex-eliciting stimulus, the Unconditional Stimulus (UCS), which is paired with (and preceded by) a neutral stimulus, the Conditional Stimulus (CS).

Reinforcer is a basic term in operant conditioning.

3.1 Reinforcement

Positive reinforcement occurs when a desirable event or stimulus is presented as a consequence of a behavior and the behavior increases.[12]:253 A positive reinforcer is a stimulus event for which the animal will work in order to acquire it. Verbal and physical rewards are very useful positive reinforcers.[13]

  • Example: Whenever a rat presses a button, it gets a treat. If the rat starts pressing the button more often, the treat serves to positively reinforce this behavior.
  • Example: A father gives candy to his daughter when she picks up her toys. If the frequency of picking up the toys increases, the candy is a positive reinforcer (to reinforce the behavior of cleaning up).
  • Example: A company enacts a rewards program in which employees earn prizes dependent on the number of items sold. The prizes the employees receive are the positive reinforcement if they increase sales.

Negative reinforcement occurs when the rate of a behavior increases because an aversive event or stimulus is removed or prevented from happening.[12]:253 A negative reinforcer is a stimulus event for which an organism will work in order to terminate it, escape from it, or postpone its occurrence. As opposed to positive reinforcement, verbal and physical punishment may apply in negative reinforcement.[14]

  • Example: A child cleans his or her room, and this behavior is followed by the parent stopping “nagging” or asking the child repeatedly to do so. Here, the nagging serves to negatively reinforce the behavior of cleaning because the child wants to remove that aversive stimulus of nagging.
  • Example: A person puts ointment on a bug bite to soothe an itch. If the ointment works, the person will likely increase the usage of the ointment because it resulted in removing the itch, which is the negative reinforcer.
  • Example: A company has a policy that if an employee completes their assigned work by Friday, they can have Saturday off. Working Saturday is the negative reinforcer; the employee’s productivity will increase as they avoid experiencing the negative reinforcer.

3.2 Punishment

Positive punishment occurs when a response produces a stimulus and that response decreases in probability in the future in similar circumstances.

  • Example: A mother yells at a child when he or she runs into the street. If the child stops running into the street, the yelling ceases. The yelling acts as positive punishment because the mother presents (adds) an unpleasant stimulus in the form of yelling.

Negative punishment occurs when a response produces the removal of a stimulus and that response decreases in probability in the future in similar circumstances.
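The four definitions above differ along two dimensions: whether a stimulus is presented or removed, and whether the behavior then increases or decreases. A minimal Python sketch of that two-by-two classification follows; the enum and function names are illustrative assumptions, not terms from the text.

```python
from enum import Enum


class StimulusChange(Enum):
    PRESENTED = "presented"  # a stimulus is added after the response
    REMOVED = "removed"      # a stimulus is removed (or prevented) after the response


class BehaviorChange(Enum):
    INCREASES = "increases"  # the response becomes more probable
    DECREASES = "decreases"  # the response becomes less probable


def classify_consequence(stimulus: StimulusChange, behavior: BehaviorChange) -> str:
    """Name the operant procedure from the stimulus change and its observed effect."""
    # "positive" / "negative" refers to adding or removing a stimulus,
    # not to whether the stimulus is pleasant.
    kind = "positive" if stimulus is StimulusChange.PRESENTED else "negative"
    # The effect on behavior decides reinforcement versus punishment.
    effect = "reinforcement" if behavior is BehaviorChange.INCREASES else "punishment"
    return f"{kind} {effect}"


# Nagging stops after the room is cleaned, and cleaning becomes more frequent.
print(classify_consequence(StimulusChange.REMOVED, BehaviorChange.INCREASES))
# Yelling is added after running into the street, and running decreases.
print(classify_consequence(StimulusChange.PRESENTED, BehaviorChange.DECREASES))
```

Note that the classification depends on the observed change in behavior, which matches the definitions above: a consequence counts as reinforcement or punishment only by its effect.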


3.5 Primary vs. Secondary Punishers

The same distinction between primary and secondary can be made for punishers. Pain, loud noises, bright lights, and exclusion are all things that would pass the “caveman test” as an aversive stimulus, and are therefore primary punishers. The sound of someone booing, the wrong-answer buzzer on a game show, and a ticket on your car windshield are all things you have learned to think about as negative, and are therefore secondary punishers.

3.6 Other reinforcement terms

  • A generalized reinforcer is a conditioned reinforcer that has obtained the reinforcing function by pairing with many other reinforcers and functions as a reinforcer under a wide variety of motivating operations. (One example of this is money because it is paired with many other reinforcers.)[18]
  • In reinforcer sampling, a potentially reinforcing but unfamiliar stimulus is presented to an organism without regard to any prior behavior.
  • Socially-mediated reinforcement (direct reinforcement) involves the delivery of reinforcement that requires the behavior of another organism.
  • The Premack principle is a special case of reinforcement elaborated by David Premack, which states that a highly preferred activity can be used effectively as a reinforcer for a less-preferred activity.[18]
  • Reinforcement hierarchy is a list of actions, rank-ordering the most desirable to least desirable consequences that may serve as a reinforcer. A reinforcement hierarchy can be used to determine the relative frequency and desirability of different activities, and is often employed when applying the Premack principle.
  • Contingent outcomes are more likely to reinforce behavior than non-contingent responses. Contingent outcomes are those directly linked to a causal behavior, such as a light turning on being contingent on flipping a switch. Note that contingent outcomes are not necessary to demonstrate reinforcement, but perceived contingency may increase learning.
  • Contiguous stimuli are stimuli closely associated by time and space with specific behaviors. They reduce the amount of time needed to learn a behavior while increasing its resistance to extinction. Giving a dog a piece of food immediately after sitting is more contiguous with (and therefore more likely to reinforce) the behavior than a several-minute delay in food delivery following the behavior.
  • Noncontingent reinforcement refers to response-independent delivery of stimuli identified as reinforcers for some behaviors of that organism. However, this typically entails time-based delivery of stimuli identified as maintaining aberrant behavior, which decreases the rate of the target behavior.[19] As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent “reinforcement”.[20]

4 Natural and artificial

In his 1967 paper, Arbitrary and Natural Reinforcement, Charles Ferster proposed classifying reinforcement into events that increase frequency of an operant as a natural consequence of the behavior itself, and events that are presumed to affect frequency by their requirement of human mediation, such as in a token economy where subjects are “rewarded” for certain behavior with an arbitrary token of a negotiable value. In 1970, Baer and Wolf created a name for the use of natural reinforcers called “behavior traps”.[21] A behavior trap requires only a simple response to enter the trap, yet once entered, the trap cannot be resisted in creating general behavior change. It is the use of a behavioral trap that increases a person’s repertoire, by exposing them to the naturally occurring reinforcement of that behavior. Behavior traps have four characteristics:

  • They are “baited” with virtually irresistible reinforcers that “lure” the student to the trap
  • Only a low-effort response already in the repertoire is necessary to enter the trap
  • Interrelated contingencies of reinforcement inside the trap motivate the person to acquire, extend, and maintain targeted academic/social skills[22]
  • They can remain effective for long periods of time because the person shows few, if any, satiation effects

As can be seen from the above, artificial reinforcement is in fact created to build or develop skills, and to generalize, it is important that a behavior trap is introduced to “capture” the skill and utilize naturally occurring reinforcement to maintain or increase it. This behavior trap may simply be a social situation that will generally result from a specific behavior once it has met a certain criterion (e.g., if you use edible reinforcers to train a person to say hello and smile at people when they meet them, after that skill has been built up, the natural reinforcer of other people smiling and having more friendly interactions will naturally reinforce the skill, and the edibles can be faded).


5 Intermittent reinforcement; schedules

Much behavior is not reinforced every time it is emitted, and the pattern of intermittent reinforcement strongly affects how fast an operant response is learned, what its rate is at any given time, and how long it continues when reinforcement ceases. The simplest rules controlling reinforcement are continuous reinforcement, where every response is reinforced, and extinction, where no response is reinforced. Between these extremes, more complex “schedules of reinforcement” specify the rules that determine how and when a response will be followed by a reinforcer.

Specific schedules of reinforcement reliably induce specific patterns of response, irrespective of the species being investigated (including humans in some conditions). However, the quantitative properties of behavior under a given schedule depend on the parameters of the schedule, and sometimes on other, non-schedule factors. The orderliness and predictability of behavior under schedules of reinforcement was evidence for B.F. Skinner's claim that by using operant conditioning he could obtain “control over behavior”, in a way that rendered the theoretical disputes of contemporary comparative psychology obsolete. The reliability of schedule control supported the idea that a radical behaviorist experimental analysis of behavior could be the foundation for a psychology that did not refer to mental or cognitive processes. The reliability of schedules also led to the development of applied behavior analysis as a means of controlling or altering behavior.

Many of the simpler possibilities, and some of the more complex ones, were investigated at great length by Skinner using pigeons, but new schedules continue to be defined and investigated.

5.1 Simple schedules

  • Ratio schedule – the reinforcement depends only on the number of responses the organism has performed.
  • Continuous reinforcement (CRF) – a schedule of reinforcement in which every occurrence of the instrumental response (desired response) is followed by the reinforcer.[18]
      - Lab example: each time a rat presses a bar it gets a pellet of food.
      - Real world example: each time a dog defecates outside its owner gives it a treat; each time a person puts $1 in a candy machine and presses the buttons he receives a candy bar.

Simple schedules have a single rule to determine when a single type of reinforcer is delivered for a specific response.

[Figure: A chart demonstrating the different response rates of the four simple schedules of reinforcement; each hatch mark designates a reinforcer being given.]

  • Fixed ratio (FR) – schedules deliver reinforcement after every nth response.[18]
      - Example: “FR2” = every second desired response the subject makes is reinforced.
      - Lab example: “FR5” = rat’s bar-pressing behavior is reinforced with food after every 5 bar-presses in a Skinner box.
      - Real-world example: “FR10” = used car dealer gets a $1000 bonus for each 10 cars sold on the lot.
  • Variable ratio schedule (VR) – reinforced on average every nth response, but not always on the nth response.[18]
      - Lab example: “VR4” = first pellet delivered on 2 bar presses, second pellet delivered on 6 bar presses, third pellet on 4 bar presses (2 + 6 + 4 = 12; 12/3 = 4 bar presses on average to receive a pellet).
      - Real-world example: slot machines (because, though the probability of hitting the jackpot is constant, the number of lever presses needed to hit the jackpot is variable).
  • Fixed interval (FI) – reinforced after n amount of time.
      - Example: “FI1” = reinforcement provided for the first response after 1 second.
      - Lab example: “FI15” = rat’s bar-pressing behavior is reinforced for the first bar press after 15 seconds have passed since the last reinforcement.
      - Real world example: washing machine cycle.
  • Variable interval (VI) – reinforced on an average of n amount of time, but not always exactly n amount of time.[18]


  • Organisms whose schedules of reinforcement are “thinned” (that is, requiring more responses or a greater wait before reinforcement) may experience “ratio strain” if thinned too quickly. This produces behavior similar to that seen during extinction.
      - Ratio strain: the disruption of responding that occurs when a fixed ratio response requirement is increased too rapidly.
      - Ratio run: high and steady rate of responding that completes each ratio requirement. Usually, a higher ratio requirement causes longer post-reinforcement pauses to occur.
  • Partial reinforcement schedules are more resistant to extinction than continuous reinforcement schedules.
      - Ratio schedules are more resistant than interval schedules and variable schedules more resistant than fixed ones.
      - Momentary changes in reinforcement value lead to dynamic changes in behavior.[24]
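The four simple schedules described in this section (FR, VR, FI, VI) can be sketched as small rules that decide, response by response, whether a reinforcer is delivered. The Python below is only an illustration under stated assumptions: the class names, the `respond(t)` interface, and the uniform distributions used to vary the ratio and the interval are hypothetical choices, not part of any standard procedure.

```python
import random


class FixedRatio:
    """FR n: every n-th response is reinforced."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR n: the required number of responses varies but averages n."""

    def __init__(self, n, rng=None):
        self.n = n
        self.rng = rng or random.Random()
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform over 1 .. 2n-1, whose mean is n.
        return self.rng.randint(1, 2 * self.n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI n: the first response at least n time units after the last reinforcer is reinforced."""

    def __init__(self, n):
        self.n = n
        self.last = 0.0

    def respond(self, t):
        if t - self.last >= self.n:
            self.last = t
            return True
        return False


class VariableInterval:
    """VI n: like FI, but the required wait varies around an average of n."""

    def __init__(self, n, rng=None):
        self.n = n
        self.rng = rng or random.Random()
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform over 0 .. 2n, whose mean is n.
        return self.rng.uniform(0, 2 * self.n)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


# FR5: 20 responses earn 4 reinforcers.
fr5 = FixedRatio(5)
print(sum(fr5.respond(t) for t in range(1, 21)))  # 4

# FI15 with one response per second for 60 seconds: reinforced at t = 15, 30, 45, 60.
fi15 = FixedInterval(15)
print(sum(fi15.respond(t) for t in range(1, 61)))  # 4

# The VR4 lab example: requirements of 2, 6 and 4 presses average to 4.
print((2 + 6 + 4) / 3)  # 4.0
```

On the interval schedules, responding faster does not bring the reinforcer sooner, whereas on the ratio schedules it does; that difference is built into the `respond` rules above.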

5.2 Compound schedules

Compound schedules combine two or more different simple schedules in some way using the same reinforcer for the same behavior. There are many possibilities; among those most often used are:

  • Alternative schedules – A type of compound schedule where two or more simple schedules are in effect and whichever schedule is completed first results in reinforcement.[25]
  • Conjunctive schedules – A complex schedule of reinforcement where two or more simple schedules are in effect independently of each other, and requirements on all of the simple schedules must be met for reinforcement.
  • Multiple schedules – Two or more schedules alternate over time, with a stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect.
      - Example: FR4 when given a whistle and FI when given a bell ring.
  • Mixed schedules – Either of two, or more, schedules may occur with no stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect.
      - Example: FI6 and then VR3 without any stimulus warning of the change in schedule.
  • Concurrent schedules – A complex reinforcement procedure in which the participant can choose any one of two or more simple reinforcement schedules that are available simultaneously. Organisms are free to change back and forth between the response alternatives at any time.
      - Real world example: changing channels on a television.
  • Concurrent-chain schedule of reinforcement – A complex reinforcement procedure in which the participant is permitted to choose during the first link which of several simple reinforcement schedules will be in effect in the second link. Once a choice has been made, the rejected alternatives become unavailable until the start of the next trial.
  • Interlocking schedules – A single schedule with two components where progress in one component affects progress in the other component. In an interlocking FR60–FI120 schedule, for example, each response subtracts time from the interval component such that each response is “equal” to removing two seconds from the FI.
  • Chained schedules – Reinforcement occurs after two or more successive schedules have been completed, with a stimulus indicating when one schedule has been completed and the next has started.
      - Example: FR10 under a green light; when it is completed, the light turns yellow to indicate FR3; after that is completed, the light turns red to indicate VI6, etc. At the end of the chain, a reinforcer is given.
  • Tandem schedules – Reinforcement occurs when two or more successive schedule requirements have been completed, with no stimulus indicating when a schedule has been completed and the next has started.
      - Example: VR10; after it is completed, the schedule is changed without warning to FR10; after that, it is changed without warning to FR16, etc. At the end of the series of schedules, a reinforcer is finally given.
  • Higher-order schedules – completion of one schedule is reinforced according to a second schedule; e.g. in FR2 (FI10 secs), two successive fixed interval schedules require completion before a response is reinforced.
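Two of the compound schedules listed above, alternative and conjunctive, differ only in how the component requirements are combined: whichever is completed first versus all of them. A minimal Python sketch of that difference is shown here, assuming a fixed-ratio and a fixed-interval component; the class and function names are hypothetical and chosen for illustration.

```python
class RatioRequirement:
    """Met once n responses have been made since the last reinforcer."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def record(self, t):
        self.count += 1

    def met(self, t):
        return self.count >= self.n

    def reset(self, t):
        self.count = 0


class IntervalRequirement:
    """Met once the given time has elapsed since the last reinforcer."""

    def __init__(self, seconds):
        self.seconds = seconds
        self.start = 0.0

    def record(self, t):
        pass  # responses do not advance an interval requirement

    def met(self, t):
        return t - self.start >= self.seconds

    def reset(self, t):
        self.start = t


class CompoundSchedule:
    """Combines component requirements with a rule such as any() or all()."""

    def __init__(self, components, combine):
        self.components = list(components)
        self.combine = combine

    def respond(self, t):
        for c in self.components:
            c.record(t)
        if self.combine(c.met(t) for c in self.components):
            # Delivering the reinforcer restarts every component.
            for c in self.components:
                c.reset(t)
            return True
        return False


def alternative(*components):
    """Reinforce when whichever component is completed first is met."""
    return CompoundSchedule(components, any)


def conjunctive(*components):
    """Reinforce only when every component requirement is met."""
    return CompoundSchedule(components, all)


# One response per second against FR10 combined with FI30.
alt = alternative(RatioRequirement(10), IntervalRequirement(30))
print(next(t for t in range(1, 100) if alt.respond(t)))   # 10: the ratio is completed first

conj = conjunctive(RatioRequirement(10), IntervalRequirement(30))
print(next(t for t in range(1, 100) if conj.respond(t)))  # 30: both requirements are needed
```

The other compound schedules in the list (multiple, mixed, chained, tandem) switch between components over time rather than combining them at once, so they would need a different structure than the one sketched here.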

5.3 Superimposed schedules

The psychology term superimposed schedules of reinforcement refers to a structure of rewards where two or more simple schedules of reinforcement operate simultaneously. Reinforcers can be positive, negative, or both. An example is a person who comes home after a long day at work. The behavior of opening the front door is rewarded by a big kiss on the lips by the person’s spouse and a rip in the pants from the family dog jumping enthusiastically. Another example of superimposed schedules of reinforcement is a pigeon in an experimental cage pecking at a button. The pecks deliver a hopper of grain every 20th peck, and access to water after every 200 pecks.
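The pigeon example can be sketched in a few lines of Python: a single stream of pecks is counted against two ratio requirements at once, so one response can produce more than one consequence. The function name and the returned event format are assumptions made for illustration.

```python
def superimposed_pecks(total_pecks, grain_every=20, water_every=200):
    """Return (peck number, consequences) for each peck that is reinforced."""
    events = []
    for peck in range(1, total_pecks + 1):
        delivered = []
        if peck % grain_every == 0:
            delivered.append("grain")   # FR20 component
        if peck % water_every == 0:
            delivered.append("water")   # FR200 component, superimposed on the same response
        if delivered:
            events.append((peck, delivered))
    return events


events = superimposed_pecks(200)
print(len(events))   # 10 reinforced pecks in the first 200
print(events[-1])    # (200, ['grain', 'water']): the 200th peck earns both
```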

Superimposed schedules of reinforcement are a type of compound schedule that evolved from the initial work on simple schedules of reinforcement by B.F. Skinner and his colleagues (Skinner and Ferster, 1957). They demonstrated that reinforcers could be delivered on schedules, and further that organisms behaved differently under different schedules. Rather than a reinforcer, such as food or water, being delivered every time as a consequence of some behavior, a reinforcer could be delivered after more than one instance of the behavior. For example, a pigeon may be required to peck a button switch ten times before food appears. This is a “ratio schedule”. Also, a reinforcer could be delivered after an interval of time passed following a target behavior. An example is a rat that is given a food pellet immediately following the first response that occurs after two minutes have elapsed since the last lever press. This is called an “interval schedule”.

In addition, ratio schedules can deliver reinforcement following a fixed or variable number of behaviors by the individual organism. Likewise, interval schedules can deliver reinforcement following fixed or variable intervals of time following a single response by the organism. Individual behaviors tend to generate response rates that differ based upon how the reinforcement schedule is created. Much subsequent research in many labs examined the effects on behaviors of scheduling reinforcers.

If an organism is offered the opportunity to choose between or among two or more simple schedules of reinforcement at the same time, the reinforcement structure is called a “concurrent schedule of reinforcement”. Brechner (1974, 1977) introduced the concept of superimposed schedules of reinforcement in an attempt to create a laboratory analogy of social traps, such as when humans overharvest their fisheries or tear down their rainforests. Brechner created a situation where simple reinforcement schedules were superimposed upon each other. In other words, a single response or group of responses by an organism led to multiple consequences. Concurrent schedules of reinforcement can be thought of as “or” schedules, and superimposed schedules of reinforcement can be thought of as “and” schedules. Brechner and Linder (1981) and Brechner (1987) expanded the concept to describe how superimposed schedules and the social trap analogy could be used to analyze the way energy flows through systems.

Superimposed schedules of reinforcement have many real-world applications in addition to generating social traps. Many different human individual and social situations can be created by superimposing simple reinforcement schedules. For example, a human being could have simultaneous tobacco and alcohol addictions. Even more complex situations can be created or simulated by superimposing two or more concurrent schedules. For example, a high school senior could have a choice between going to Stanford University or UCLA, and at the same time have the choice of going into the Army or the Air Force, and simultaneously the choice of taking a job with an internet company or a job with a software company. That is a reinforcement structure of three superimposed concurrent schedules of reinforcement.

Superimposed schedules of reinforcement can create the three classic conflict situations (approach–approach conflict, approach–avoidance conflict, and avoidance–avoidance conflict) described by Kurt Lewin (1935) and can operationalize other Lewinian situations analyzed by his force field analysis. Other examples of the use of superimposed schedules of reinforcement as an analytical tool are its application to the contingencies of rent control (Brechner, 2003) and the problem of toxic waste dumping in the Los Angeles County storm drain system (Brechner, 2010).

5.4 Concurrent schedules

In operant conditioning, concurrent schedules of reinforcement are schedules of reinforcement that are simultaneously available to an animal subject or human participant, so that the subject or participant can respond on either schedule. For example, in a two-alternative forced choice task, a pigeon in a Skinner box is faced with two pecking keys; pecking responses can be made on either, and food reinforcement might follow a peck on either. The schedules of reinforcement arranged for pecks on the two keys can be different. They may be independent, or they may be linked so that behavior on one key affects the likelihood of reinforcement on the other.

It is not necessary for responses on the two schedules to be physically distinct. In an alternate way of arranging concurrent schedules, introduced by Findley in 1958, both schedules are arranged on a single key or other response device, and the subject can respond on a second key to change between the schedules. In such a “Findley concurrent” procedure, a stimulus (e.g., the color of the main key) signals which schedule is in effect.

Concurrent schedules often induce rapid alternation between the keys. To prevent this, a “changeover delay” is commonly introduced: each schedule is inactivated for a brief period after the subject switches to it.

When both the concurrent schedules are variable intervals, a quantitative relationship known as the matching law is found between relative response rates in the two schedules and the relative reinforcement rates they deliver; this was first observed by R.J. Herrnstein in 1961. Matching law is a rule for instrumental behavior which states that the relative rate of responding on a particu-


MPR, short for mathematical principles of reinforcement. Killeen and Sitomer are among the key researchers in this field.

10 Criticisms

The standard definition of behavioral reinforcement has been criticized as circular, since it appears to argue that response strength is increased by reinforcement, and defines reinforcement as something that increases response strength (i.e., response strength is increased by things that increase response strength). However, the correct usage[30] of reinforcement is that something is a reinforcer because of its effect on behavior, and not the other way around. It becomes circular if one says that a particular stimulus strengthens behavior because it is a reinforcer, and does not explain why a stimulus is producing that effect on the behavior. Other definitions have been proposed, such as F.D. Sheffield’s “consummatory behavior contingent on a response”, but these are not broadly used in psychology.[31]

10.1 History of the terms

In the 1920s Russian physiologist Ivan Pavlov may have been the first to use the word reinforcement with respect to behavior, but (according to Dinsmoor) he used its approximate Russian cognate sparingly, and even then it referred to strengthening an already-learned but weakening response. He did not use it, as it is today, for selecting and strengthening new behaviors. Pavlov’s introduction of the word extinction (in Russian) approximates today’s psychological use.

In popular use, positive reinforcement is often used as a synonym for reward, with people (not behavior) thus being “reinforced”, but this is contrary to the term’s consistent technical usage, as it is a dimension of behavior, and not the person, which is strengthened. Negative reinforcement is often used by laypeople and even social scientists outside psychology as a synonym for punishment. This is contrary to modern technical use, but it was B.F. Skinner who first used it this way in his 1938 book. By 1953, however, he followed others in thus employing the word punishment, and he re-cast negative reinforcement for the removal of aversive stimuli.

There are some within the field of behavior analysis[32] who have suggested that the terms “positive” and “negative” constitute an unnecessary distinction in discussing reinforcement as it is often unclear whether stimuli are being removed or presented. For example, Iwata poses the question: "...is a change in temperature more accurately characterized by the presentation of cold (heat) or the removal of heat (cold)?"[33]:363 Thus, reinforcement could be conceptualized as a pre-change condition replaced by a post-change condition that reinforces the behavior that followed the change in stimulus conditions.

11 Applications

11.1 Climate of fear

Main article: Climate of fear

Partial or intermittent negative reinforcement can create an effective climate of fear and doubt.[34]

11.2 Nudge theory

Main article: Nudge theory

Nudge theory (or nudge) is a concept in behavioural science, political theory and economics which argues that positive reinforcement and indirect suggestions to try to achieve non-forced compliance can influence the motives, incentives and decision making of groups and individuals, at least as effectively – if not more effectively – than direct instruction, legislation, or enforcement.

12 See also

  • Applied behavior analysis
  • Behavioral cusp
  • Child grooming
  • Dog training
  • Learned industriousness
  • Overjustification effect
  • Power and control in abusive relationships
  • Psychological manipulation
  • Punishment
  • Reinforcement learning
  • Reward system
  • Society for Quantitative Analysis of Behavior
  • Traumatic bonding

13 References

[1] Winkielman P., Berridge KC, and Wilbarger JL. (2005). Unconscious affective reactions to masked happy versus angry faces influence consumption behavior and judgments of value. Pers Soc Psychol Bull: 31, 121–35.

[2] Mondadori C, Waser PG, and Huston JP. (2005). Time-dependent effects of post-trial reinforcement, punishment or ECS on passive avoidance learning. Physiol Behav: 18, 1103–9. PMID 928533

[3] White NM, Gottfried JA (2011). “Reward: What Is It? How Can It Be Inferred from Behavior?". PMID

[4] White NM. (2011). Reward: What is it? How can it be inferred from behavior. In: Neurobiology of Sensation and Reward. CRC Press PMID 22593908

[5] Malenka RC, Nestler EJ, Hyman SE (2009). “Chapter 15: Reinforcement and Addictive Disorders”. In Sydor A, Brown RY. Molecular Neuropharmacology: A Foundation for Clinical Neuroscience (2nd ed.). New York: McGraw-Hill Medical. pp. 364–375. ISBN 9780071481274.

[6] Nestler EJ (December 2013). “Cellular basis of memory for addiction”. Dialogues Clin. Neurosci. 15 (4): 431– PMC 3898681. PMID 24459410.

[7] “Glossary of Terms”. Mount Sinai School of Medicine. Department of Neuroscience. Retrieved 9 February 2015.

[8] Thorndike, E. L. “Some Experiments on Animal Intelligence,” Science, Vol. VII, January/June, 1898

[9] Skinner, B. F. “The Behavior of Organisms: An Experimental Analysis”, 1938 New York: Appleton-Century-Crofts

[10] Skinner, B.F. (1948). Walden Two. Toronto: The Macmillan Company.

[11] Honig, Werner (1966). Operant Behavior: Areas of Research and Application. New York: Meredith Publishing Company. p. 381.

[12] Flora, Stephen (2004). The Power of Reinforcement. Albany: State University of New York Press.

[13] Nikoletseas Michael M. (2010) Behavioral and Neural Plasticity, p. 143 ISBN 978-

[14] Nikoletseas Michael M. (2010) Behavioral and Neural Plasticity. ISBN 978-

[15] D'Amato, M. R. (1969). Melvin H. Marx, ed. Learning Processes: Instrumental Conditioning. Toronto: The Macmillan Company.

[16] Harter, J. K. (2002). C. L. Keyes, ed. Well-Being in the Workplace and its Relationship to Business Outcomes: A Review of the Gallup Studies. Washington D.C.: American Psychological Association.

[17] Skinner, B.F. (1974). About Behaviorism

[18] Miltenberger, R. G. “Behavioral Modification: Principles and Procedures”. Thomson/Wadsworth, 2008.

[19] Tucker, M.; Sigafoos, J. & Bushell, H. (1998). Use of noncontingent reinforcement in the treatment of challenging behavior. Behavior Modification, 22, 529–47.

[20] Poling, A. & Normand, M. (1999). Noncontingent reinforcement: an inappropriate description of time-based schedules that reduce behavior. Journal of Applied Behavior Analysis, 32, 237–8.

[21] Baer and Wolf, 1970, The entry into natural communities of reinforcement. In R. Ulrich, T. Stachnik, & J. Mabry (eds.), Control of human behavior (Vol. 2, pp. 319–24). Glenview, IL: Scott, Foresman.

[22] Kohler & Greenwood, 1986, Toward a technology of generalization: The identification of natural contingencies of reinforcement. The Behavior Analyst, 9, 19–26.

[23] Derenne, A. & Flannery, K.A. (2007). Within Session FR Pausing. The Behavior Analyst Today, 8(2), 175– BAO

[24] McSweeney, F.K.; Murphy, E.S. & Kowal, B.P. (2001) Dynamic Changes in Reinforcer Value: Some Misconceptions and Why You Should Care. The Behavior Analyst Today, 2(4), 341–7 BAO

[25] Iversen, I.H. & Lattal, K.A. Experimental Analysis of Behavior. 1991, Elsevier, Amsterdam.

[26] Toby L. Martin, C.T. Yu, Garry L. Martin & Daniela Fazzio (2006): On Choice, Preference, and Preference For Choice. The Behavior Analyst Today, 7(2), 234– BAO

[27] Schacter, Daniel L., Daniel T. Gilbert, and Daniel M. Wegner. “Chapter 7: Learning.” Psychology. Second Edition. N.p.: Worth, Incorporated, 2011. 284-85.

[28] Bettinghaus, Erwin P., Persuasive Communication, Holt, Rinehart and Winston, Inc., 1968

[29] Skinner, B.F., The Behavior of Organisms. An Experimental Analysis, New York: Appleton-Century-Crofts. 1938

[30] Epstein, L.H. 1982. Skinner for the Classroom. Champaign, IL: Research Press

[31] Franco J. Vaccarino, Bernard B. Schiff & Stephen E. Glickman (1989). Biological view of reinforcement. In Stephen B. Klein and Robert Mowrer. Contemporary learning theories: Instrumental conditioning theory and the impact of biological constraints on learning. Hillsdale, NJ, Lawrence Erlbaum Associates

[32] Michael, J. (1975, 2005). Positive and negative reinforcement, a distinction that is no longer necessary; or a better way to talk about bad things. Journal of Organizational Behavior Management, 24, 207–22.

[33] Iwata, B.A. (1987). Negative reinforcement in applied behavior analysis: an emerging technology. Journal of Applied Behavior Analysis, 20, 361–78.

[34] Braiker, Harriet B. (2004). Who’s Pulling Your Strings? How to Break The Cycle of Manipulation. ISBN 0-07-144672-9.

16 Text and image sources, contributors, and licenses

16.1 Text

  • Reinforcement Source: https://en.wikipedia.org/wiki/Reinforcement?oldid=693049246 Contributors: SimonP, Vaughan, Jfitzg, Avel- lano~enwiki, Trontonian, Furrykef, Hyacinth, Seglea, Elf, Michael Devore, Skagedal, Sam Hocevar, TronTonian, Discospinster, Rich Farm- brough, John FitzGerald, Bender235, Circeus, Johnkarp, Pearle, Nsaa, Jumbuck, Gary, Velella, AverageGuy, Heida Maria, Bobrg~enwiki, Angr, Uncle G, Macaddct1984, Marudubshinki, Mandarax, Graham87, Limegreen, Rjwilmsi, NeonMerlin, Keimzelle, Stimfunction, Epolk, Nesbit, Stephenb, Member, Msikma, Duldan, Janarius, ONEder Boy, RL0919, Epipelagic, Sandstein, Closedmouth, Studentne, Jff119, GraemeL, SmackBot, Fvguy72, Edgar181, Gilliam, Chris the speller, Friga, Radagast83, Khukri, Vina-iwbot~enwiki, SashatoBot, Swatjester, Mouse Nightshirt, Robofish, [email protected], Timhill2004, Jcbutler, Hydra Rider, Paulieraw, Raystorm, Dycedarg, Penbat, Montanabw, Mattisse, Darklilac, Magioladitis, Shikorina, Arno Matthias, Think outside the box, Nyttend, WhatamIdoing, Thuglas, Lunar Spectrum, WLU, MartinBot, Poeloq, Romistrub, DC2~enwiki, J.delanoy, Love Krittaya, Florkle, Kpmiyapuram, Ovidiu pos, LittleHow, Time River, Pundit, Arkabc, QuackGuru, PNG crusade bot, Charlesdrakew, Saibod, Lova Falk, Alcmaeonid, Doc James, Gbawden, SieBot, SheepNotGoats, Aylad, Nmfbd, Docfox, Rinconsoleao, Martarius, ClueBot, MLCommons, Av99, ArminHP, Kitsunegami, Rlaitinen, Jim- Cropcho, 1ForTheMoney, Aitias, Josh.Pritchard.DBA, Londonsista, Feinoha, Addbot, EdgeNavidad, Tcncv, Friginator, Download, Jarble, Luckas-bot, Yobot, AnomieBOT, Trevithj, Galoubet, Neptune5000, Jon187, LilHelpa, Xqbot, GrouchoBot, Aaron Kauppi, Erik9bot, Jmbrowne, GliderMaven, FrescoBot, D'ohBot, BenzolBot, Pinethicket, I dream of horses, Soderick, Anemone112, LawBot, Suffusion of Yellow, Mean as custard, TamaraDarian, Queeste, Efb18, John of Reading, Doncorto, Wikijord, Abolitionista, Smo.Kwon, Lavenbi, Donner60, GeoffreyMay, ClueBot NG, Amreshpandey, Widr, 
Spannerjam, Reify-tech, BG19bot, Dsomlo, Goldenlegion, MusikAnimal, Crh23, Nikkiopelli, Db4wp, Maestro814, Mogism, Nidzo7, Marybelr, Kernsters, Teacher1754, Epicgenius, Lilkid3194, Purplefanta1109, Tentinator, Jenniwey, Star767, Racolepsychcapstone, Seppi333, Roshu Bangal, Meyers.mike, VanishedUser sdu9aya9fs654654, Cervota, Hayleymcole, Andrewvgator, Superduck120, 2014psat, Daddiouconn, KasparBot, Chanellr, Jay daven, ScottRoberts and Anonymous: 213

16.2 Images

  • File:Operant_conditioning_diagram.png Source: https://upload.wikimedia.org/wikipedia/commons/1/16/Operant_conditioning_diagram.png License: CC BY-SA 3.0 Contributors: I used Adobe illustrator Original artist: Studentne
  • File:Schedule_of_reinforcement.png Source: https://upload.wikimedia.org/wikipedia/commons/8/8f/Schedule_of_reinforcement.png License: Public domain Contributors: Transferred from en.wikipedia; transferred to Commons by User:Premeditated Chaos using CommonsHelper. Original artist: Original uploader was PNG crusade bot at en.wikipedia

16.3 Content license

  • Creative Commons Attribution-Share Alike 3.