In behavioral science we like to look at things that are concrete and observable. Why do people respond to specific scenarios and stimuli in different ways? How do they differ from one another? How can we adapt what we present in ways that either increase or decrease a person’s responding? These are questions we can apply to our area of interest, video games, to explore what game designers have put into their medium to get you hooked and keep you hooked. Video games require the audience to participate in ways that other art mediums do not. The direct responses of the consumer shape and define their progress through the game. A hallmark trait of video games is using rewards as marks of progress that get people to play longer, increase their own skill at the game, and master objectives that the designers put in place. Let’s discuss some of the behavioral principles that may be in play in the games you know and love. See if you can identify these concepts in your own experiences with video games.
Reinforcement vs. Rewards
In behavioral science, we use the word reinforcement for a consequence that strengthens a behavior in the future when the person is presented with the same setting or stimulus (the antecedent). When a reinforcer is presented after a behavior, we expect the probability of that behavior to go up the next time the person is placed in that situation. It is the foundation of learning and operant behavior. Operant behavior is a large piece of this conceptual puzzle: it is behavior that has been shaped to serve a purpose in the environment because it has been reinforced in the past. How does this differ from rewards? In gaming of all types, there are rewards. These are pre-set consequences or prizes that follow the completion of specific objectives laid out for the player. Some rewards are interesting to a player and keep them engaged with the game, and others are not, leading to disinterest or a falloff in responding (playing). What makes a reinforcer different from a reward is that a reinforcer is defined by its effect on the individual’s future responding. When we say reinforcer, we are saying with a degree of certainty that this “reward” has affected behavior before and is preferred by the individual, because it has been shown to work in the past. Let’s look at this scenario:
Player 1 must press the circle button when presented with a box in order to break the box and gain a prize (100 points).
If Player 1 presses the circle button and breaks the box, and gets the 100 points, they have been “rewarded”.
If Player 1 presses the circle button and breaks the box, gets the 100 points, and presses the circle button when presented with more boxes in the future, they have been reinforced.
It could be said that 100 points was enough to reinforce the behavior. This affects future playing behavior by pairing a preferred stimulus (the points) with an operant behavior (pressing the circle button) in the presence of the box (the antecedent). This sequence of antecedent, behavior, and consequence is called the three-term contingency.
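The distinction in the scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the labels it returns, and the simple "did the response rate go up" rule are all hypothetical, not a standard measurement procedure.

```python
def classify_consequence(presses_before: int, boxes_before: int,
                         presses_after: int, boxes_after: int) -> str:
    """Label a consequence by what happened to later responding.

    Hypothetical rule: compare how often the player pressed the button
    per box shown, before and after the 100 points were delivered.
    """
    if boxes_before <= 0 or boxes_after <= 0:
        raise ValueError("need at least one box in each period")
    rate_before = presses_before / boxes_before
    rate_after = presses_after / boxes_after
    # The points were always a reward; they count as a reinforcer only
    # if responding in the presence of the box actually increased.
    if rate_after > rate_before:
        return "reinforcer"
    return "reward only"


print(classify_consequence(2, 10, 8, 10))
print(classify_consequence(5, 10, 5, 10))
```

The point of the sketch is that the label depends on the player's later behavior, not on the prize itself.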
If game designers want their players to learn certain skills specific to their game, or to keep people playing it, they need to focus on casting the widest net of reinforcers, rather than just rewards. Anything can be a reward, but only when it functions as a reinforcer will we see players use those skills to progress again and again.
Schedules of Reinforcement:
In the example above, we have a single situation, with a single reinforcer. Games are made up of varied scenarios, competing choices for the player to take, and sometimes we see two types of reinforcement used at the same time. How does that work? Sometimes a player is presented with an opportunity to complete two objectives at the same time. This brings a level of challenging complexity that most players enjoy more than a simplistic single system of reward, because it raises the stakes in terms of what they can receive. Let’s take a look at some simple schedules of reinforcement below:
Fixed Ratio Reinforcement:
In this schedule of reinforcement, we see a set number of responses met with a set amount of reward. So if a player beats 1 adversary and receives 200 points, this is called an FR1 (fixed-ratio 1) schedule. If a player needs to beat 2 adversaries to receive 200 points, this is called an FR2 schedule, and so on. The benefit of this style of reinforcement schedule is that it is consistent and a player can depend on it. If they can predict the number of points or rewards they receive for each action, they can match their responding to the amount of reinforcement that satisfies them.
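A fixed-ratio schedule is simple enough to sketch as a counter. The class name and its `respond` method below are hypothetical names chosen for illustration; the 200-point, two-adversary values come from the FR2 example above.

```python
class FixedRatio:
    """Pay `points` after every `ratio`-th response (an FR-`ratio` schedule)."""

    def __init__(self, ratio: int, points: int) -> None:
        if ratio < 1:
            raise ValueError("ratio must be at least 1")
        self.ratio = ratio
        self.points = points
        self._count = 0  # responses since the last payout

    def respond(self) -> int:
        """Record one response (one adversary beaten) and return the payout."""
        self._count += 1
        if self._count == self.ratio:
            self._count = 0
            return self.points
        return 0


fr2 = FixedRatio(ratio=2, points=200)
payouts = [fr2.respond() for _ in range(4)]
print(payouts)  # [0, 200, 0, 200]
```

Because the payout pattern repeats exactly, a player can predict it after only a few responses.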
Variable Ratio Reinforcement:
Some people know this schedule of reinforcement from the RNGs (random number generators) that are put in games to provide variability. For some people, it is also a very strong system of reinforcement. Gambling runs on this principle too. With a variable ratio, there is a percentage chance that any given response will be rewarded. Unlike the fixed ratio, the reinforcer does not arrive after a predictable number of responses. The player must rely on chance, or on repeating responses (for more opportunities), in order to receive a reward. The variability can show up in the magnitude of the reward (an adversary is sometimes worth 100 points, but may also be worth 500) or in its frequency (some adversaries reward points, others do not). As we may expect, the chance to receive a large reward for a standard amount of effort can be a very reinforcing contingency.
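The RNG version of this schedule can be sketched as a per-response chance of payout. The function name, the 25% chance, and the seed are assumptions made for illustration. Strictly speaking, a constant chance on every response is often called a random-ratio schedule; on average it pays out once every 1 / chance responses.

```python
import random


def variable_ratio_payout(rng: random.Random, chance: float, points: int) -> int:
    """Return `points` with probability `chance`, otherwise 0."""
    return points if rng.random() < chance else 0


# Hypothetical setup: each adversary has a 25% chance of paying 500 points.
rng = random.Random(42)  # seeded so the run is repeatable
payouts = [variable_ratio_payout(rng, 0.25, 500) for _ in range(1000)]
hits = sum(1 for p in payouts if p > 0)
print(f"{hits} payouts in {len(payouts)} responses")
```

Unlike the fixed ratio, no single response tells the player when the next payout will arrive; only the long-run average is stable.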
Looking at these two schedules, we can expect that both have their respective fans. Some players prefer predictability and something that can be planned for. A specific amount of successful responding would equal an expected amount of reward, every time (Fixed Ratio). Others enjoy the variability; sometimes even a standard amount of responding could pay off in a huge reward (Variable Ratio). When we combine two or more simple schedules, we get the complex schedules:
If you give the player the option between a Fixed Ratio and a Variable Ratio, we call this a concurrent schedule of reinforcement. It would look something like this:
If a player walks down path A to fight the goblins, they can expect 100 points for each goblin adversary beaten, but if the player goes down path B to fight the birds, there is a variable chance of getting 800 points for each bird beaten. Both of these options are available, and choosing one does not necessarily rule out pursuing the other. A player could fight the goblins for a little while, then choose to fight the birds. The options are both available, thus concurrent. You commonly see these schedules of reinforcement in games that allow for free exploration, or multiple avenues to the same objective.
If we give the player both a Variable Ratio and Fixed Ratio at the same time, we call that a superimposed schedule of reinforcement. It would look something like this:
A player is set in a scenario where they have to face both goblin adversaries and bird adversaries at the same time. Each goblin adversary that they beat rewards them 100 points (Fixed Ratio), and each bird adversary beaten gives a chance of getting 800 points (Variable Ratio). These two schedules are now running at the exact same time, and the player has the opportunity to pursue each simultaneously.
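The difference between the concurrent and superimposed arrangements can be sketched with the goblin and bird examples. All function names are hypothetical, and the 20% chance for the birds is an assumed value, since the examples only say the chance is variable.

```python
import random


def goblin_points() -> int:
    """Fixed ratio (FR1): every goblin beaten pays 100 points."""
    return 100


def bird_points(rng: random.Random, chance: float = 0.2) -> int:
    """Variable ratio: a bird pays 800 points with an assumed 20% chance."""
    return 800 if rng.random() < chance else 0


def concurrent_encounter(choice: str, rng: random.Random) -> int:
    """Both paths are open, but the player pursues one per encounter."""
    if choice == "goblins":
        return goblin_points()
    if choice == "birds":
        return bird_points(rng)
    raise ValueError(f"unknown path: {choice!r}")


def superimposed_encounter(rng: random.Random) -> int:
    """Both schedules run in the same encounter, so both can pay out."""
    return goblin_points() + bird_points(rng)


rng = random.Random(7)
print(concurrent_encounter("goblins", rng))  # 100
print(superimposed_encounter(rng))
```

In the concurrent case the player's choice decides which schedule is in effect; in the superimposed case a single encounter is scored by both.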
These are just a few examples of the types of reinforcement schedules you may come across in games. There are no real limits to how many schedules of reinforcement may run concurrently or superimposed. You could run multiple fixed ratios at the same time (an orange is worth 100 points every time, an apple is worth 200 points every time), or multiple variable ratios (an orange is sometimes worth 100 points, an apple is sometimes worth 200 points). The possibilities are limitless. There even exist schedules of reinforcement that rely on intervals of time, rather than responding (every 3 minutes you receive 100 points, or sometimes every 10 minutes you receive 100, regardless of what responding the player is engaged in).
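The time-based example (100 points every 3 minutes, whatever the player is doing) can be sketched as a function of elapsed time alone. The function and parameter names are hypothetical. Behavior analysts often call this kind of response-independent delivery a fixed-time schedule.

```python
def time_based_points(elapsed_seconds: float,
                      interval_seconds: float = 180.0,
                      points: int = 100) -> int:
    """Total points delivered after `elapsed_seconds` of play.

    The payout depends only on the clock, not on what the player does.
    """
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    if elapsed_seconds < 0:
        raise ValueError("elapsed time cannot be negative")
    deliveries = int(elapsed_seconds // interval_seconds)
    return deliveries * points


print(time_based_points(179))  # 0
print(time_based_points(180))  # 100
print(time_based_points(600))  # 300
```

Notice that no response count appears anywhere in the function, which is exactly what separates this schedule from the ratio schedules.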
It stands to reason, however, that the more schedules which run at the same time, and the more complicated the contingencies of reinforcement, the greater the risk that the player will not understand what responses or choices are actually being reinforced. This may lead to some misattribution, or superstitious responding (responding that has been reinforced by a contingency that did not actually exist). When reinforcement schedules are too complex or not clear, they can create confusion with the players, and result in loss of responding or interest in the game.
Complications:
Human behavior is not always easily predicted. Game designers can create vast systems of intertwined schedules of reinforcement that keep players enthralled for hours, but there may come a point where actual player responding does not match the predictive models. We have to be aware of some of the other factors from behavioral science and research that lead to a decrease in responding (playing) or to disinterest. Below are just a few that we commonly come across in video games.
Punishment: Punishment is a condition where a stimulus is either presented or removed following a behavior, and this decreases the probability that the behavior will happen in the future. It serves the opposite purpose of reinforcement. It comes in two variations: positive and negative. These terms do not reflect anything “good” or “bad” but rather the addition or subtraction of a stimulus, which produces a marked decrease in future behavior in the same (or similar) scenarios. In video games, they look something like this:
• Positive Punishment: A player walks into a hole. That player receives damage. The damage is the presentation of a stimulus, and assuming damage is aversive to this player’s style or goals, they would be much less likely to walk into the hole again.
• Negative Punishment: A player buys an overpriced item in an in-game shop. Assuming the player has lost a significant amount of something that was preferred in exchange for something non-preferred, they are not likely to repeat the buying behavior in the future.
SΔ (S-Delta): S-Delta shares a similarity with punishment in that it does not strengthen or reinforce a behavior or series of responses. An S-Delta is a stimulus in whose presence a particular behavior receives no reinforcement. For example, a player who is used to running down a path to pick up items or points may hold down the “Run” button to increase their reinforcement. However, if this same behavior is attempted in the presence of a wall (the S-Delta), holding the “Run” button does not receive the same reinforcement. Running behavior is not necessarily punished overall, but it is less likely to be used in the presence of the wall.
Ratio Strain: Ratio strain is a condition where more responding is required, but the reinforcement is not enough to maintain it. For example, suppose a player is used to defeating goblins for 100 points each and is then presented with Super Goblins, which are much more difficult to defeat but still reward only 100 points. The reward is no longer reinforcing enough to maintain the repeated responding. This can often be solved by raising the amount of reinforcement to match the effort.
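The arithmetic behind "raising the reinforcement to match the effort" can be sketched as follows. This assumes, purely for illustration, that effort can be counted as the number of hits needed to defeat an adversary and that keeping points per hit constant is a reasonable target; neither assumption is a behavioral law, and the function names are hypothetical.

```python
def points_per_effort(points: float, effort: float) -> float:
    """Reward earned per unit of effort (here, per hit required)."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return points / effort


def matched_reward(base_points: float, base_effort: float,
                   new_effort: float) -> float:
    """Reward that keeps points per effort unchanged at the new effort level."""
    return points_per_effort(base_points, base_effort) * new_effort


# Assumed values: a goblin takes 1 hit, a Super Goblin takes 4 hits.
print(points_per_effort(100, 1))  # 100.0 points per hit for a goblin
print(points_per_effort(100, 4))  # 25.0 points per hit for a Super Goblin
print(matched_reward(100, 1, 4))  # 400.0 points would restore the old rate
```

Under these assumptions, a Super Goblin that still pays 100 points delivers a quarter of the reward per hit, which is the kind of drop that can produce ratio strain.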
Satiation: Satiation is a common modifying condition for human behavior. There comes a point when so much of a specific reinforcer has been acquired that it no longer functions as a reinforcer. For example, if a player is satisfied with having 10,000 points and achieves 10,000 points, any further accumulation of points would not reinforce continued play. The reward condition would remain, but it would no longer be considered reinforcing. This can often be solved by allowing enough time to pass that the satiation condition is no longer present, or by changing reinforcers.
Response Effort: Response effort is the amount of effort a person has to put forward to complete a target behavior. This is not a barrier to playing in itself, but could denote a change in difficulty. So if we are reinforcing the behavior of defeating ghosts or eating dots, the effort may be how fast a person has to respond to obstacles, or the amount of fine motor skill necessary to navigate to the objective. If the amount of effort exceeds what a player can manage, we can say the response effort has been set too high for the behavior to be reinforced.
The Social Factor
We would be remiss to ignore one of the strongest forms of reinforcement, one that is not necessarily provided by the game itself but can come from success at it, or even from the pursuit of playing: social reinforcement. Some players enjoy the thrill of competition (competitive multiplayer); others enjoy jolly cooperation (cooperative multiplayer). Many find strong reinforcement in sharing their experiences (streaming), or showing off completed objectives (trophies/completion). Bringing other people into the experience of interacting with video games is by no means a new prospect, but quantitatively measuring social reinforcement in video games is still very much an avenue of research worth pursuing. Some measures that game designers may be able to collect are: how many times multiplayer features are used, how long players spend in the multiplayer parts of the game, viewership in streamed media, and of course, consumer demand for specific social features that would be feasible in a game. There may also be cases where games rely too much on external social reinforcement without providing sufficient contingencies of their own within the game’s design.
Balancing it all
Video games are rich examples of how human behavior interacts with digital entertainment, and the concepts above are just the tip of the iceberg. Some games employ one or two of those concepts; others employ complex systems of intentional reinforcement and punishment. Across different generations we have seen popular features rise and fall, but all seem to follow the same basic principles: objectives, responses, and rewards. Reading this, you may have some ideas about other phenomena that might have an effect on the relation between video game and human. The list of concepts above is in no way exhaustive, but it’s a topic we may be able to explore more deeply in the future. Leave comments below with your thoughts, theories, and opinions.