How Reinforcement Schedules Work

“The schedule of reinforcement for a particular behaviour specifies whether every response is followed by reinforcement or whether only some responses are followed by reinforcement” (Miltenberger, 2007, p. 86).

What is a Schedule of Reinforcement?

A schedule of reinforcement is a protocol or set of rules that a teacher will follow when delivering reinforcers (e.g. tokens when using a token economy). The “rules” might state that reinforcement is given after every correct response to a question; or for every 2 correct responses; or for every 100 correct responses; or when a certain amount of time has elapsed.

Broadly speaking, there are two categories of reinforcement schedule: “continuous” schedules and “intermittent” schedules.

A continuous schedule of reinforcement (sometimes abbreviated to CRF) means reinforcement is delivered after every single instance of the target behaviour, whereas an intermittent schedule of reinforcement (INT) means reinforcement is delivered after some responses but not after every one.

Continuous reinforcement schedules are more often used when teaching new behaviours, while intermittent reinforcement schedules are used when maintaining previously learned behaviours (Cooper et al. 2007).

Continuous Schedule of Reinforcement (CRF)

Within an educational setting, a CRF means the teacher delivers reinforcement after every correct response from their students. For example, if you were teaching a student to read the letters A, B, C and D, then every time you presented one of these letters and the student read it correctly, you would deliver reinforcement.

For an everyday example: every time you press the number 9 button on your television remote control, your TV changes to channel 9; every time you turn on your kettle, it heats the water inside it; and every time you turn on your kitchen tap (faucet), water flows from it (unless any of these are broken, of course).

A continuous schedule of reinforcement.

Intermittent Schedules of Reinforcement

There are four basic types of intermittent schedules of reinforcement:

  • Fixed-Ratio (FR) schedule
  • Variable-Ratio (VR) schedule
  • Fixed-Interval (FI) schedule
  • Variable-Interval (VI) schedule

Fixed-Ratio Schedule (FR)

A fixed-ratio schedule of reinforcement means that reinforcement should be delivered after a constant or “fixed” number of correct responses. For example, a fixed ratio schedule of 2 means reinforcement is delivered after every 2 correct responses. The chosen number could be 5, 10, 20 or it could be 100 or more; there is no limit but the number must be defined.

Generally, when writing a fixed-ratio schedule into the discrete trial script, it is shortened to “FR” with the number of required correct responses stated after it (Malott & Trojan-Suarez, 2006).

For example, choosing to reinforce for every second correct response would be written as “FR2”; reinforcing for every fifth correct response would be an “FR5”; for every 100 correct responses would be an “FR100” and so on.

Note that when running an ABA programme, you may see the reinforcement schedule defined as “FR1”. Technically this is a continuous reinforcement schedule (CRF), but to keep in line with how other ratio schedules are defined it is written using the “FR” abbreviation, as “FR1”.

Comparing an FR1 and an FR2 schedule of reinforcement.
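To make the ratio rule concrete, here is a minimal Python sketch of a fixed-ratio counter. The class and method names are illustrative only, not taken from any ABA curriculum or software.

```python
class FixedRatio:
    """Reinforce after every `ratio` correct responses (an FR-n schedule).
    An FR1 schedule is equivalent to continuous reinforcement (CRF)."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0  # correct responses since the last reinforcer

    def record_correct_response(self):
        """Return True when reinforcement should be delivered."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0  # reset the counter once reinforcement is earned
            return True
        return False


# FR2: every second correct response earns reinforcement
fr2 = FixedRatio(2)
print([fr2.record_correct_response() for _ in range(6)])
# -> [False, True, False, True, False, True]
```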

Variable-Ratio Schedule (VR)

When using a variable-ratio (VR) schedule of reinforcement, the number of correct responses required before reinforcement is delivered will “vary”, but it must average out at a specific number. Just like a fixed-ratio schedule, a variable-ratio schedule can use any number, but that number must be defined.

For example, a teacher following a “VR2” schedule of reinforcement might give reinforcement after 1 correct response, then after 3 more correct responses, then 2 more, then 1 more and finally after 3 more correct responses.

Overall there were a total of 10 correct responses (1 + 3 + 2 + 1 + 3 = 10), reinforcement was delivered 5 times and so reinforcement was delivered for every 2 correct responses on average (10 ÷ 5 = 2). As can be seen in the image below, reinforcement did not follow a constant or fixed number of correct responses and instead “varied” and hence the name “variable-ratio” schedule of reinforcement.

A variable ratio schedule of reinforcement. Specifically a VR3 schedule.
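A variable-ratio schedule can be sketched the same way. This hypothetical version draws each new response requirement at random so that the requirements average out at the defined number; a teacher might instead use a pre-planned list such as the 1, 3, 2, 1, 3 sequence above.

```python
import random


class VariableRatio:
    """Reinforce after a varying number of correct responses that
    averages out at `mean_ratio` (a VR-n schedule)."""

    def __init__(self, mean_ratio, seed=None):
        self.mean_ratio = mean_ratio
        self.rng = random.Random(seed)
        self.count = 0
        self.requirement = self._next_requirement()

    def _next_requirement(self):
        # Uniform over 1..(2n - 1) has a mean of exactly n.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def record_correct_response(self):
        self.count += 1
        if self.count >= self.requirement:
            self.count = 0
            self.requirement = self._next_requirement()
            return True
        return False
```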

Fixed-Interval Schedule (FI)

A fixed-interval schedule means that reinforcement becomes available after a specific period of time. The schedule is abbreviated to “FI” followed by the amount of time that must pass before reinforcement becomes available, e.g. an FI2 would mean reinforcement becomes available after 2 minutes have passed; an FI20 means 20 minutes must pass; and so on.

A common misunderstanding is that reinforcement is automatically delivered at the end of this interval but this is not the case. Reinforcement only becomes available to be delivered and would only be given if the target behaviour is emitted at some stage after the time interval has ended.

To better explain this, say the target behaviour is for a child to sit upright at his desk, and an FI2 schedule of reinforcement is chosen. If the child sits upright during the 2-minute fixed interval, no reinforcement is given, because reinforcement for the target behaviour is not available during the interval.

If the child is slumped in his seat when the 2-minute interval elapses, reinforcement would still not be given, because at that point reinforcement has only become available. Just because he emitted the target behaviour (sitting upright) during the interval does not mean reinforcement is delivered at the end of the interval.

Say 10 more minutes pass before the boy sits upright; it is only now, once he has emitted the target behaviour after the interval has ended, that reinforcement is delivered.

Once reinforcement is delivered, the 2-minute fixed interval starts again.

After the 2-minute fixed interval has elapsed, it could take 2 seconds, 10 minutes, 20 minutes, 200 minutes or more for the boy to sit upright; no matter how long it takes, no reinforcement is delivered until he does.
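The availability rule described above can be modelled in a few lines. In this sketch (the names are illustrative), only the first target behaviour emitted after the interval has elapsed earns reinforcement, and the clock then restarts:

```python
import time


class FixedInterval:
    """FI-n schedule: reinforcement becomes *available* once `interval`
    seconds have elapsed, is delivered for the first target behaviour
    emitted after that point, and the interval then restarts."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.start = time.monotonic()

    def record_target_behaviour(self):
        """Call whenever the target behaviour is observed; returns True
        if reinforcement should be delivered now."""
        if time.monotonic() - self.start >= self.interval:
            self.start = time.monotonic()  # reinforce and restart the interval
            return True
        return False  # behaviour during the interval earns nothing
```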

Variable-Interval Schedule (VI)

The variable-interval (VI) schedule of reinforcement means the time periods that must pass before reinforcement becomes available will “vary” but must average out at a specific time interval. Again the time interval can be any number but must be defined.

Following a “VI3” schedule of reinforcement, a teacher could make reinforcement available after 2 minutes, then 5 minutes, then 3 minutes, then 4 minutes and finally 1 minute.

In this example, reinforcement became available 5 times over a total interval period of 15 minutes.

On average, then, three minutes had to pass before reinforcement became available (2 + 5 + 3 + 4 + 1 = 15; 15 ÷ 5 = 3), so this was a VI3 schedule.

Just like a fixed-interval (FI) schedule, reinforcement is only available to be delivered after the time interval has ended. Reinforcement is not delivered straight after the interval ends; the child must emit the target behaviour after the time interval has ended for reinforcement to be delivered.
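A variable-interval schedule only changes how the next interval is chosen. In this hypothetical sketch the intervals are drawn uniformly around the defined mean; a real programme might instead use a pre-planned list such as the 2, 5, 3, 4, 1 minute sequence above.

```python
import random
import time


class VariableInterval:
    """VI-n schedule: like FI, but each interval varies while averaging
    out at `mean_interval` seconds."""

    def __init__(self, mean_interval, seed=None):
        self.mean = mean_interval
        self.rng = random.Random(seed)
        self.start = time.monotonic()
        self.interval = self._next_interval()

    def _next_interval(self):
        # Uniform over [0.5 * mean, 1.5 * mean] has a mean of exactly `mean`.
        return self.rng.uniform(0.5 * self.mean, 1.5 * self.mean)

    def record_target_behaviour(self):
        if time.monotonic() - self.start >= self.interval:
            self.start = time.monotonic()
            self.interval = self._next_interval()
            return True
        return False
```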

Interval Schedules: A Tip

A helpful way to think of the interval schedules of reinforcement (both fixed and variable) is to think of the chosen time period as a period of time where no reinforcement would be given for the target behaviour.

Interval Schedules with a Limited Hold

Both fixed-interval (FI) and variable-interval (VI) schedules of reinforcement might have what is called a “limited hold” placed on them. When a limited hold is applied to either interval schedule, reinforcement is only available for a set time period after the time interval has ended.

For example, using an FI2 schedule with a limited hold of 10 seconds means that when the 2-minute time interval has ended, the child must engage in the target behaviour within 10 seconds or the fixed interval of 2 minutes will start again and no reinforcement will be delivered. The limited hold is abbreviated to “LH”, so the example above would be written as “FI2-minutes LH10-seconds” or sometimes “FI2min LH10sec”.
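One way to model the limited hold, under the same illustrative naming as the sketches above: if the behaviour arrives during the hold window it is reinforced; if the window has already expired, the clock rolls forward as if a fresh interval began at the expiry, with no reinforcement for the missed window.

```python
import time


class FixedIntervalLimitedHold:
    """FI schedule with a limited hold, e.g. FI2min LH10sec: after the
    interval ends, reinforcement stays available for only `hold` seconds;
    if the behaviour does not occur in that window, a new interval begins
    with no reinforcement delivered."""

    def __init__(self, interval_seconds, hold_seconds):
        self.interval = interval_seconds
        self.hold = hold_seconds
        self.start = time.monotonic()

    def record_target_behaviour(self):
        now = time.monotonic()
        # Roll the clock past any interval + hold windows that expired
        # without the behaviour occurring (no reinforcement for those).
        while now - self.start >= self.interval + self.hold:
            self.start += self.interval + self.hold
        if now - self.start >= self.interval:
            self.start = now  # behaviour fell inside the hold window
            return True
        return False
```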

Thinner and Thicker Schedules of Reinforcement

Sometimes you might hear the term “thicker schedule of reinforcement” or “thinner schedule of reinforcement”. These terms are used to describe a change that may be made to a schedule of reinforcement already being used.

For example, if a teaching programme was using an FR10 schedule (reinforcement delivered after every 10 correct responses), then a “thinner” schedule would mean increasing the number of correct responses needed to earn reinforcement, so the amount of reinforcement is reduced or “thinned”. Think of “thinner” in terms of “less” reinforcement. For example, a thinner schedule than FR10 might be FR15: the child would now have to give 15 correct responses before earning reinforcement.

A “thicker” schedule would mean decreasing the number of correct responses needed to earn reinforcement, so the amount of reinforcement is increased. Think of “thicker” in terms of “more” reinforcement.

So a thicker schedule than FR10 might be FR5: the child would now have to give only 5 correct responses before earning reinforcement.

Sometimes the term “denser” schedule of reinforcement might be used to denote a thicker schedule – but these terms mean the same thing.

Thinner and thicker schedules of reinforcement.

Combining Schedules of Reinforcement

Say a teacher is working through a spelling programme with a child and is using a token economy as positive reinforcement on an FR2 schedule of reinforcement; one token (reinforcement) is being delivered for every second correct spelling. So for the first trial, the teacher says “Spell apple”, the child correctly spells the word and the teacher does not give a token…but what does the teacher do? How does the child know if he’s right or wrong?

To combat this, combinations of reinforcement schedules may be used where “verbal praise” is on a continuous (or an FR1) schedule of reinforcement while the token economy is on the FR2 schedule.

So for every correct spelling, the teacher would say something like “great job!” or “brilliant!” or “you’re right!”, and then every second correct spelling is reinforced with a token as well as verbal praise. In these cases, you would likely see “FR1 praise, FR2 token” written in the discrete trial script to specify which schedules of reinforcement are being used.

Combining fixed ratio schedules of reinforcement to deliver both tokens and verbal praise for correct responding.
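Reusing the hypothetical FixedRatio class from the earlier sketch, running the two schedules side by side might look like this:

```python
# FixedRatio is the illustrative class sketched earlier in this article.
praise = FixedRatio(1)  # FR1: verbal praise after every correct spelling
tokens = FixedRatio(2)  # FR2: a token after every second correct spelling


def on_correct_spelling():
    rewards = []
    if praise.record_correct_response():
        rewards.append("verbal praise")
    if tokens.record_correct_response():
        rewards.append("token")
    return rewards


for trial in range(1, 5):
    print(trial, on_correct_spelling())
# 1 ['verbal praise']
# 2 ['verbal praise', 'token']
# 3 ['verbal praise']
# 4 ['verbal praise', 'token']
```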

There are also “compound” schedules of reinforcement, where different types of reinforcement schedules are combined in various ways. There is a lot that could be said about these schedules, but for the sake of this article we will not go into that detail.


References

  • Cooper, J., Heron, T., & Heward, W. (2007). Applied Behaviour Analysis. New Jersey: Pearson Education.
  • Malott, R., & Trojan-Suarez, E. (2004). Principles of Behaviour. New Jersey: Pearson Prentice Hall.
  • Miltenberger, R. (2008). Behaviour Modification. Belmont, CA: Wadsworth Publishing.

Source: http://www.educateautism.com/applied-behaviour-analysis/schedules-of-reinforcement.html

Key Takeaways: Reinforcement Schedules

  • A reinforcement schedule is a rule stating which instances of behavior, if any, will be reinforced.
  • Reinforcement schedules can be divided into two broad categories: continuous schedules and partial schedules (also called intermittent schedules).
  • In a continuous schedule, every instance of a desired behavior is reinforced, whereas partial schedules only reinforce the desired behavior occasionally.
  • Partial reinforcement schedules are described as either fixed or variable, and as either interval or ratio.
  • Combinations of these four descriptors yield four kinds of partial reinforcement schedules: fixed-ratio, fixed-interval, variable-ratio and variable-interval.

In 1957, a revolutionary book for the field of behavioral science was published: Schedules of Reinforcement by C.B. Ferster and B.F. Skinner.

The book described how organisms could be reinforced on different schedules and how different schedules resulted in varied behavioral outcomes.

Ferster and Skinner’s work established that how and when behaviors were reinforced carried significant effects on the strength and consistency of those behaviors.

Introduction

A schedule of reinforcement is a component of operant conditioning (also known as instrumental conditioning). It is an arrangement that determines when to reinforce a behavior; for example, whether to reinforce in relation to time or to the number of responses.

Schedules of reinforcement can be divided into two broad categories: continuous reinforcement, which reinforces a response every time, and partial reinforcement, which reinforces a response occasionally.

The type of reinforcement schedule used significantly impacts the response rate and resistance to extinction of the behavior.

Research into schedules of reinforcement has yielded important implications for the field of behavioral science, including choice behavior, behavioral pharmacology and behavioral economics.

Continuous Reinforcement

In continuous schedules, reinforcement is provided after every single instance of the desired behavior.

Because the behavior is reinforced every time, the association is easy to make and learning occurs quickly. However, this also means that extinction occurs quickly once reinforcement is no longer provided.

For Example

We can better understand the concept of continuous reinforcement by using candy machines as an example.

Candy machines are examples of continuous reinforcement because every time we put money in (behavior), we receive candy in return (positive reinforcement).

However, if a candy machine were to fail to provide candy twice in a row, we would likely stop trying to put money in (Myers, 2011).

We have come to expect our behavior to be reinforced every time it is performed and quickly grow discouraged if it is not.

Partial (Intermittent) Reinforcement Schedules

Unlike continuous schedules, partial schedules only reinforce the desired behavior occasionally rather than all of the time. This leads to slower learning, since it is initially more difficult to make the association between behavior and reinforcement.

However, partial schedules also produce behavior that is more resistant to extinction. Organisms are tempted to persist in their behavior in hopes that they will eventually be rewarded.

For instance, slot machines at casinos operate on partial schedules. They provide money (positive reinforcement) after an unpredictable number of plays (behavior).

Hence, slot players are likely to keep playing in the hope that they will gain money the next round (Myers, 2011).

Partial reinforcement schedules occur most frequently in everyday life, and vary according to whether reinforcement follows a number of responses (ratio) or a period of time (interval), and whether that requirement is constant (fixed) or changing (variable).

Combinations of these four descriptors yield four kinds of partial reinforcement schedules: fixed-ratio, fixed-interval, variable-ratio and variable-interval.

Fixed Interval Schedule

In operant conditioning, a fixed-interval schedule is when reinforcement is given to a desired response after a specific (predictable) amount of time has passed.

Such a schedule results in a tendency for organisms to increase the frequency of responses closer to the anticipated time of reinforcement. However, immediately after being reinforced, the frequency of responses decreases.

The fluctuation in response rates means that a fixed-interval schedule produces a scalloped pattern of responding rather than steady rates of responding.

Variable Interval Schedule

In operant conditioning, a variable-interval schedule is when reinforcement is provided after a random (unpredictable) amount of time has passed, following a specific behavior being performed.

This schedule produces a low, steady rate of responding, since organisms cannot predict when they will next receive reinforcement.

Fixed Ratio Schedule

In operant conditioning, a fixed-ratio schedule reinforces behavior after a specified number of correct responses.

This kind of schedule results in high, steady rates of responding, typically with a brief pause after each reinforcer is delivered. Piecework pay, in which a worker is paid after completing a set number of items, operates on this kind of schedule.

Variable Ratio Schedule

A variable ratio schedule is a schedule of reinforcement where a behavior is reinforced after a random number of responses.

This kind of schedule results in high, steady rates of responding. Organisms are persistent in responding because of the hope that the next response might be one needed to receive reinforcement. This schedule is utilized in lottery games.

Response Rates of Different Reinforcement Schedules

Ratio schedules – those linked to number of responses – produce higher response rates compared to interval schedules.

In addition, variable schedules produce more consistent behavior than fixed schedules; the unpredictability of reinforcement results in more consistent responses than predictable reinforcement (Myers, 2011).

Extinction of Responses Reinforced at Different Schedules

Resistance to extinction refers to how long a behavior continues to be displayed even after it is no longer being reinforced. A response high in resistance to extinction will take a longer time to become completely extinct.

Different schedules of reinforcement produce different levels of resistance to extinction. In general, schedules that reinforce unpredictably are more resistant to extinction.

Therefore, the variable-ratio schedule is more resistant to extinction than the fixed-ratio schedule. The variable-interval schedule is more resistant to extinction than the fixed-interval schedule as long as the average intervals are similar.

In the fixed-ratio schedule, resistance to extinction increases as the ratio increases. In the fixed-interval schedule, resistance to extinction increases as the interval lengthens in time.

Of the four types of partial reinforcement schedules, the variable-ratio schedule is the most resistant to extinction. This can help to explain addiction to gambling.

Even as gamblers may not receive reinforcers after a high number of responses, they remain hopeful that they will be reinforced soon.

Implications for Behavioral Psychology

In his article “Schedules of Reinforcement at 50: A Retrospective Appreciation,” Morgan (2010) describes the ways in which schedules of reinforcement are being used to research important areas of behavioral science.

Choice Behavior

Behaviorists have long been interested in how organisms make choices about behavior – how they choose between alternatives and reinforcers. They have been able to study behavioral choice through the use of concurrent schedules.

By operating two separate schedules of reinforcement (often both variable-interval schedules) simultaneously, researchers are able to study how organisms allocate their behavior between the different options.

An important discovery has been the matching law, which states that an organism’s relative rate of responding on a given schedule will closely match the relative rate of reinforcement obtained from it.
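In its simplest two-alternative form, the matching law can be written as:

$$\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}$$

where $B_1$ and $B_2$ are the rates of responding on the two alternatives and $R_1$ and $R_2$ are the rates of reinforcement obtained from them.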

For instance, say that Joe’s father gave Joe money almost every time Joe asked for it, but Joe’s mother almost never gave Joe money when he asked for it. Since Joe’s response of asking for money is reinforced more often when he asks his father, he is more likely to ask his father rather than his mother for money.

Research has found that individuals will try to choose behavior that will provide them with the largest reward. There are also further factors that impact an organism’s behavioral choice: rate of reinforcement, quality of reinforcement, delay to reinforcement and response effort.

The blog Behaviour Babble summarizes the findings well: “Everyone prefers higher amounts, quality, and rates of reward. They prefer rewards that come sooner and requires less overall effort to receive.”

Behavioral Pharmacology

Schedules of reinforcement are used to evaluate preference and abuse potential for drugs. One method used in behavioral pharmacological research to do so is through a progressive ratio schedule.

In a progressive-ratio schedule, the response requirement is raised each time reinforcement is attained. In the case of pharmacology, participants must demonstrate an increasing number of responses in order to attain an injection of a drug (the reinforcement).

Under a progressive ratio schedule, a single injection may require up to thousands of responses. Participants are measured for the point where responding eventually stops, which is referred to as the “break point.”

Gathering data about the break points of drugs allows for a categorization mirroring the abuse potential of different drugs. Using the progressive ratio schedule to evaluate drug preference and/or choice is now commonplace in behavioral pharmacology.
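As an illustration, a progressive-ratio requirement might be modelled as follows. The arithmetic step increment is one common choice among several (geometric progressions are also used), and the names are hypothetical:

```python
class ProgressiveRatio:
    """Progressive-ratio schedule: the response requirement rises after
    each reinforcer. The last ratio completed before responding stops
    is recorded as the 'break point'."""

    def __init__(self, start=1, step=2):
        self.requirement = start  # responses needed for the next reinforcer
        self.step = step          # how much the requirement grows each time
        self.count = 0
        self.break_point = 0      # last completed requirement

    def record_response(self):
        """Return True when the current requirement has been met."""
        self.count += 1
        if self.count >= self.requirement:
            self.break_point = self.requirement
            self.count = 0
            self.requirement += self.step
            return True
        return False
```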

Behavioral Economics

Operant experiments offer an ideal way to study microeconomic behavior; participants can be viewed as consumers and reinforcers as commodities.

Through experimenting with different schedules of reinforcement, researchers can alter the availability or price of a commodity and track how response allocation changes as a result.

For example, changing the ratio schedule (increasing or decreasing the number of responses needed to receive the reinforcer) is a way to study elasticity.

Another example of the role reinforcement schedules play is in studying substitutability, by making different commodities available at the same price (the same schedule of reinforcement). By using the operant laboratory to study behavior, researchers have the benefit of being able to manipulate independent variables and measure the dependent variables.


Annabelle Lim is a second-year student majoring in psychology and minoring in educational studies at Harvard College.

She is interested in the intersections between psychology and education, as well as psychology and the law.

How to reference this article:

Lim, A. (2020, July 2). Schedules of reinforcement. Simply Psychology. www.simplypsychology.org/schedules-of-reinforcement.html

APA Style References

Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.

Morgan, D. L. (2010). Schedules of reinforcement at 50: A retrospective appreciation. The Psychological Record, 60(1), 151–172.

Myers, D. G. (2011). Psychology (10th ed.). Worth Publishers.

What Influences My Behavior? The Matching Law Explanation That Will Change How You Understand Your Actions. (2017, August 27). Behaviour Babble. https://www.behaviourbabble.com/what-influences-my-behavior/
