© RDA 17 March 2002

 

Targeting, Clicker-Bridging, and Positive and Negative Aspects of Horse Training

 

Richard D. Alexander and Cynthia Kagarise Sherman

 

RDA Note: Dr. Cynthia Kagarise Sherman, my former doctoral student at the University of Michigan, and also my good friend, trains dogs in Ithaca, New York, and has a special enthusiasm for agility competition. She and I combined our slightly different but complementary knowledge and experience to produce this essay. -- RDA.

 

Think about an animal -- any animal -- in either its natural habitat or a training situation. At any time it is likely to face one of three situations (or stimuli): positive, negative, or neutral. Positive situations are generally defined as those the animal seeks to create, enter, or repeat. Negative situations are those it seeks to avoid or not repeat. Neutral situations are those it discovers (eventually) to have essentially no significance for it. Everybody already knows these things, but they are a useful way for us to begin.

To think about these situations (or kinds of stimuli) draw two lines down a sheet of paper, creating three equal-sized vertical columns on the paper (or just imagine doing this). Then write "negative" in the left-hand column, "neutral" in the middle one, and "positive" in the right-hand one. Now draw a horizontal line below the three words just added, and below the line place these words (left to right), "punishing" (or aversive), "neutral," and "rewarding." Each of these words can indicate either a general situation or a specific stimulus. We will continue to refer to this diagram.

In our human eyes, a rewarding or positive situation for a horse might be moving quietly with the familiar herd of its life on a warm sunny day in an open field filled with quality grass, and with not the slightest evidence of a predator or other enemy anywhere. Or it might be a warm, quiet stall on a blustery winter evening, well bedded in clean straw, with a tray filled with oats and a manger full of bright grass hay, and located just next to the horse's best herd buddy. Or it could be setting out on an adventurous ride up into the hills with the horse's favorite skillfully gentle human riding on its back after grooming it thoroughly and effectively just before saddling and bridling it gently and quietly.

Again, in our eyes, punishing, aversive, or negative situations are probably more or less the opposites of those above: being chased at top speed by a murderous predator; being wretchedly hungry and cold with no food or shelter available; being separated from the herd in a dangerous location while pelted by a steady cold rain. Or: being ridden by a human who believes that repeated jabs with spurs, slaps with the ends of the reins, or jerks against the bit are the only ways to keep a riding horse on its good behavior; or by a human who failed to clean dry caked mud or manure from the part of the chest where the cinch is tightened; or one who does not realize that the bit is faulty, pinching the horse's mouth or tongue every time it shifts in position; or one who did not notice that the horse has a sharp pebble caught in a corner of its hoof.

What about neutral situations and stimuli? These would be circumstances that are neither rewarding nor punishing. In effect, neutral stimuli mean little or nothing to the horse, once they have been examined and discovered to involve neither positive nor negative stimuli, neither promises of rewards nor threats of punishment or unpleasantness. Once such assessment has been made by the animal, neutral stimuli will surely fade from its attention. Before a stimulus or situation has been so considered and judged by an animal, a predator such as a dog or cat might view the stimulus as a potential source of food and investigate rather aggressively. A prey animal, such as a horse, on the other hand, is likely to treat a new or poorly understood situation (that might actually turn out to be neutral) as dangerous, and either avoid it or else investigate cautiously, ready to bolt in an instant. In effect, a prey animal may frequently be best off treating new situations or stimuli as negative. And we all know they tend to do that.

Let's think a bit further about what horses that find themselves in these three kinds of situations are likely to do. A horse in a punishing or negative situation will surely attempt behaviors that remove it from the situation and bring it into a neutral or a rewarding situation. Likewise, a horse in a neutral situation will surely move into a rewarding situation if possible.

But we need to take this idea further. Thus, it is a reward not only to receive a rewarding stimulus but to replace it with a more rewarding one. It is a reward not only to remove an aversive stimulus but to replace it with a less aversive one.

It is a punishment not only to receive an aversive stimulus but to have it changed into a more aversive one. It is a punishment not only to have a rewarding stimulus removed but to have it changed for a less rewarding one.

We know these things are true for ourselves. They seem also to be true for horses, and there does not seem to be any reason for doubting that this is generally so.

So we can draw a long arrow from left to right on our picture of different kinds of situations or stimuli. This arrow depicts the way in which an animal such as a horse will respond to situations or stimuli at various locations in the picture. No matter where the horse starts, it will always seek to move itself to the right -- toward reductions in punishing or aversive stimuli, then toward rewards, and finally toward increasing rewards.

This last realization is critical to understanding animal training. It tells us that for training purposes we can almost eliminate the three columns in our minds. Neutral stimuli will always tend to be converted to positive or negative ones (by discovery or accident), and if neither happens, to be ignored or forgotten. Negative stimuli will be dealt with so as to lessen or remove their negative aspects, or to replace them with neutral or rewarding stimuli. But, because every reduction of aversiveness in negative stimuli constitutes some kind of reward, the line between punishment and reward becomes blurred. It is further blurred by the realization that precisely the same trend is created by the animal even after it has moved entirely into what we would interpret as the reward column: it continues to seek ever more rewarding situations or stimuli (for example, see Teaching Yourself to Train Your Horse (TYTYH), p. 40, Competing Natural Stimuli).

 

For most training situations, then, we can simplify our picture by keeping it in our minds that any horse, anywhere, any time, will seek the least aversive or most rewarding situation it can identify. Reducing or removing aversiveness thus becomes rewarding. Everything we think about training should begin with this realization. To be most effective in training we must create situations that the horse can turn into a system of more or less continually increasing rewards for itself -- not merely rewards of just any intensity or quality, but rewards that are minimally of sufficient quality and intensity to be sought by the horse in its current situation, and ideally of sufficient quality and intensity to be sought by the horse no matter its situation. And we must do this in such a way that the horse identifies and seeks rewards which cause it to do the precise things we wish it to do to serve our interests.

Now let's talk about trainers who use clickers to condition an animal and then work back to what we have just been saying. Clicker trainers may watch a horse until it carries out the first step in a desired behavior sequence, click, and then promptly feed the horse; or they may induce a behavior, click when it is performed, and immediately feed the horse. As soon as the clicker becomes for the horse a reliable indicator of an imminent reward -- in any situation -- the horse can be successively conditioned to do almost anything the perceptive trainer wishes. The beauty of the clicker, as a "bridge" between behavior and the trainer's approval (reward), is that it easily provides highly accurate timing of the signal to the horse that it is going to be rewarded. It also allows training "by rewards alone." But we'll need to come back to this last point.

Since accurate timing is probably the most difficult problem for trainers, there is something special to be learned from clicker training. For that reason alone we believe every horse trainer can benefit from reading a book like You Can Train Your Horse to Do Anything. On Target Training: Clicker Training and Beyond, by Shawna and Vinton Karrasch, published by Trafalgar Square Publishing, North Pomfret, Vermont, 2000 (for the importance of timing in training, see also TYTYH, pp. 10-12, 40, 49-50). Nevertheless, we think horse trainers need to understand how their current training methods compare with those of "clicker training," and how to recognize similar or identical procedures in the two systems.

Trainers who use clickers may begin by providing an object (target), such as a ball on a stick, which the animal has a tendency to touch with its nose. As soon as this happens, the trainer immediately gives a click with the clicker, then rewards the animal with food. This sequence is easily established if the animal already has a tendency to touch objects with its nose (e.g., belongs to a species that uses odor extensively). Eventually, the clicker becomes generalized as a "bridging stimulus" announcing a desirable behavior, therefore also announcing an impending reward for any and all behaviors the trainer wishes to reinforce. The clicker itself becomes a reward, so long as it consistently promises a "real" reward such as food. When this situation is achieved, the clicker not only facilitates precise timing of rewards, but also speeds up the connecting of a reward to whatever response might be desired by the trainer.

Horse people use "clicker surrogates" all the time, even though they may not have heard them called "bridges" (e.g., TYTYH, pp. 43-47). As a result they may not immediately see the connections between clickers as bridges and their own training methods. For example, a kind word, or what is called "positive talk" in TYTYH, can become a generalized bridging stimulus or "clicker surrogate" if it is consistently used to announce the imminence of a reward such as scratching, or relief from an aversive stimulus or situation. Even "leaving the horse alone" (TYTYH, p. 47) can become a bridge if its start consistently indicates that the horse just did something right (e.g., it stepped over when asked verbally, or when a leg was applied to its side). The horse's reward in this case doesn't have to be some "other" (different) positive stimulus: it can be merely the expectation or "promise" of a continuation of its being left alone, rather than being subjected further to whatever aversive or punishing stimulus (e.g., rein or leg signals) might previously have been used to change its behavior.

Clicker training as such may be most appropriate for animals that are trained entirely from the ground, because the trainer not only can place himself or herself in any position in relation to the animal as it does what is being asked, but also will have little difficulty in carrying a target and a clicker and a box of feed -- and will be easily able to dip into the box of feed and present it to the animal whenever necessary. As might be predicted, 97 of the Karrasches' 106 illustrations of a horse and at least one person are of a trainer working with a horse from the ground, and in all but 23 of the 97 without a lead rope or reins. These pictures may initially cause horse trainers to marvel that the trainer can operate without halter and lead rope, or other physical restraints. In fact, clicker training is most convenient when the standard physical restraints of horse training are not in use. Thus, in 8 of 9 pictures of the Karrasches training to mount or jump, two people are cooperating.

None of the above comments is a criticism, but all together they have relevance when considering the entire training sequence for a riding horse. Thus, to train a riding horse, and to guide and use one, the trainer obviously has to mount and ride the horse. In this situation (or even when merely leading a horse), a target, clicker, and box of feed are cumbersome to use, assuming the mounted trainer has only two hands, and arms too short to reach the horse's mouth handily from the saddle. Even a whistle as a substitute for a clicker can be maddeningly inconvenient in training a ridden horse to do a variety of things across a long working day.

Nevertheless, any trainer can use the clicker method to good advantage when working from the ground, and can learn from it how to train more effectively when riding, whether or not a clicker is actually used for bridging. Thus, seemingly large benefits of target and clicker training are that (1) initial training steps can proceed entirely from rewarding stimuli, (2) how to time rewards and their indicators can be learned quickly, and (3) the horse can learn quickly, from rewards, when it has done whatever the trainer desires. Learning from rewards is what the horse seeks to do in all situations in which it is learning on its own, so the accurate timing of a clicker can assist greatly in gaining the horse's trust. The animal can more quickly be induced to enjoy training, and all its interactions with the trainer -- to realize the admirable goal described by Xenophon in 400 B.C.: "Young horses should be trained in such a way that they not only love their riders, but look forward to the time they are with them." (see p. 10, TYTYH). In this sense clicker training is entirely consistent with the TYTYH theme of maximizing rewards and minimizing punishment. But it is not the only way to achieve harmony between horse and trainer.

In addition to the problem of how to carry and use the necessary equipment, a question arises about applying clicker training, and wholly reward-reinforcement, in training the ridden horse. To understand it consider clicker training for, say, dogs in agility competitions. In this situation the trainer is always on the ground, guiding and directing the dog, visible to it, and not typically giving signals by touching it. In contrast, the rider on a horse is largely out of the horse's sight, yet in intimate physical contact with it through the seat in the saddle, the legs and spurs, and the reins. In this situation timing by touches and pressure can be accomplished with precision, even if only through greater effort and perceptiveness than with a clicker or whistle. It is surely not surprising, however, that the riding horse trainer is essentially restricted to generating desirable behavior on the part of the horse by using what begin as aversive, negative, or punishing tactual signals. This "punishment" may be so light, even from the beginning, that some think that punishment is the wrong word to describe it. But the training method is not based simply on providing rewards, at least not at first, no matter how much we might wish it to be so. Instead it begins with an aversive or punishing stimulus, such as rein or leg pressure. This is the method described by the Karrasches as "not anywhere near as useful as reward-reinforcement." In evaluating this statement, however, we should consider that the method of providing and then removing an aversive stimulus is least effective when the trainer is just beginning it, because that is when the maximum strength of stimulus is required. It is most impressive, and convenient, later on. The reverse is probably true for target and clicker training, at least when "later on" involves a horse being ridden at speed in complex maneuvers.

Given all these considerations, it is difficult to believe either that horse people are using a mode of training that is fundamentally inferior, or that there is an obvious alternative method that could be used more effectively.

However horse training begins, horses and their riders eventually do absolutely extraordinary things together. They race at breakneck speed, change directions, adjust speed, shift between gaits, twist and turn, dash and slip sideways, attack and retreat; and they can do these kinds of things incredibly quickly. Given the amazingly intricate and rapid successions of things people do with horses, is it too simple to think that horse people are merely aversively conditioning their horses all the time?

In TYTYH it is suggested (p. 223) that the potential number of distinctive signals a rider can give to his or her horse is huge, certainly in the thousands. How can large numbers of aversive signals be given in rapid and complicated combinations while racing at breakneck speed, and the horse respond as if it were at every moment waiting for the rider's wishes and then doing its absolute best to fulfill them?

Somehow, the training of a top riding horse -- although it may necessarily begin with stimuli that are accurately classified as punishing, aversive, or negative -- has to be transformed into something both horse and rider enjoy. Somehow the best trainers seem able to transform it into a positive experience -- to move the training arrow to the right as training proceeds. We think horse people can profit by trying to discover how that happens.

Maybe the key in comparing "all-rewards" and "maximizing rewards" training systems lies in returning to the point that organisms do not merely seek to remove stimuli that are uncomfortable, aversive, painful, or punishing. They also seek to remove rewarding stimuli when those stimuli can be replaced with even more rewarding stimuli.

Can a horse trainer induce in the horse the effect of, "If you do this I promise you'll be rewarded -- in a truly positive way," using nudges and squeezes and touches on the rein that are in fact mild versions of aversive stimuli? Can reinforcement that begins with punishment in effect be turned into a system of rewards without changing the nature of the stimuli, only their intensity?

We suggest that the signals a first-rate rider gives to his or her horse do indeed gradually become positive signals -- that they become part of a regime of reward reinforcement -- of guidance rather than forced avoidance; and that they at least produce actions that are elicited in a reward-reinforcement manner. There does not seem to be any other way to explain the fluid speed and harmony of the interactions of a good horse and a good rider.

Consider two people dancing -- or, for that matter, doing anything that involves long-term intimate and sequential physical cooperation with their two bodies. Surely no one would claim that a dancer guiding a partner is using an inferior method of training, even though stimuli are being produced that call for the responding individual to relieve pressure -- turn off the pressure or the stimulus -- in precisely the way that it happens in horse training. It seems to us that the superior horse trainer not only times his or her signals precisely, but also reduces their intensity as the horse learns, so that eventually horse and rider are, in effect, "dancing" together, in this sense cooperating completely and intricately -- often with extremely rapid directional movements caused by quick changes in extremely mild and brief signals.

On p. 31 in TYTYH, it is said that ". . . a signal that is so mild as to approach invisibility cannot be expected to work because it hurts, but only because the rider has already taught the horse something very special about what that signal means." A horse's quick and perhaps even enthusiastic responses to the mildest possible signals mean that something important has previously passed between the rider and the horse, which makes the signals positive rather than negative.

Can this happen because a rider's tactual signals become like encouraging voice signals, therefore positive or rewarding stimuli? Is it a part of the horse's increasing trust that the rider will not betray it with forceful or painful actions -- especially unexpected pain or force?

At any given time during a horse's lifetime of training, its rider must tend continually to move to the right the arrows depicting the place of training efforts on our mental diagram. The rider moves these arrows toward increasingly rewarding situations and stimuli, by reducing the intensity of aversive signals and by creating training and life situations in which the horse is immediately rewarded for its responsiveness to the trainer's signals by being allowed or caused to do something that it perceives as rewarding -- or more rewarding than what it was doing immediately before. The horse helps itself to make these same changes by responding to ever lighter signals, so that strongly aversive signals simply disappear from the horse-rider interaction.

During training the necessary and sufficient signals go from "telling" to ever-milder "asking," and there arise as well what might be called introductory "alerting" signals, which tell the horse a signal is about to be applied, and sometimes predict for the horse the nature of the signal. Every experienced horse person knows that it is virtually impossible to keep from transmitting to a well trained horse what you are planning or expecting to do next, causing many people to believe that horses can "read their riders' minds." Instead the horse is reading the rider's body, and thereby his or her intentions. For example, the rider about to set his or her horse into a fast gallop from a quiet standing position makes certain tiny and imperceptible movements in the tension and positioning of his or her body, and with his or her hands on the reins, that will communicate this intent to the horse. When this happens in its best form, the eventual signal can indeed become so light as to be introduced "as with an eye dropper" or to become "slight, unobtrusive, and graceful" (quoted from Jack Brainard and Henry Wynmalen, p. 31, TYTYH). Detecting an alerting signal, hence causing subsequent signals to become milder and briefer, is a reward that the horse is able to create for itself. Charles O. Williamson, in Breaking and Training the Stock Horse (1950, p. 9) makes the concept of alerting signals explicit when he says, "A squeeze from the legs should be the first indication for any movement, even backing" (see Backing, TYTYH, p. 229).

We think all of this happens because the good rider's signals are so consistent, and the interaction of horse and rider overall so mild and pleasant -- and maybe positively adventuresome as well -- that the horse comes to trust the rider thoroughly. This means that the horse becomes positive even about being gotten in, groomed, and saddled -- that whatever the horse and rider do together becomes a series of increasing rewards to the horse as well as the rider. Under the tutelage of a first-rate rider-trainer, the horse rewards itself ever more strongly, in a form of operant conditioning, by detecting and responding appropriately to ever milder, briefer, more effective signals.

Think about a horse moving along a trail, working cattle, or performing a reining or dressage pattern. The situation will be most rewarding to the horse if it always knows what change the rider is about to ask for, and can respond before a substantial signal is given. The well-trained horse has learned that if it catches on quickly enough, it can continue without being plagued by pressure, or by any aversive signals. It has learned to sense the rider's intent almost before the rider does, because that is the way it has best rewarded itself. Only when the rider is also first-rate, however, can this degree of harmony and efficiency be achieved.

From this we are better able to appreciate the significance of Henry Wynmalen's first secret of the art of riding, paraphrased in TYTYH (p. 10): Seek always to detect the lightest possible cue to which the horse will respond. Afterward, seek continually to obtain responses to still lighter cues. When we follow Wynmalen's advice, we are helping our horses move their own training more emphatically into the "rewards" column. We are helping them generate their own motivation. We may not be using an "all-rewards" system, but we can maximize rewards and minimize punishment.

There is another point here. During the training sequence for any ridden horse, the first-rate trainer may necessarily begin on the training arrow more or less on the left (negative) side of the diagram we have constructed in our minds. The horse trainer, however, uses many different kinds of rewards, while moving along the arrow steadily to the right (positive) side as the training of the horse advances. Food is not as likely to be used -- at least not alone -- as in, say, dogs. Dogs may better understand food as a social reward because food is normally passed between dogs in a pack, and from parents to puppies, as regurgitates or prey carried in the mouth. For a ridden horse, food is not merely difficult to use as a reward: unlike dogs, horses do not pass or share food among themselves, and access to food is not as obviously "granted" among horses -- as it is in dogs -- as rewards for certain kinds of social behavior (except in the sense that mere acceptance into a herd, therefore the privilege of grazing where everyone else does, indirectly constitutes such "granting"). It is thus not surprising that horse people are more likely to use scratching or grooming, and positive talk, as rewards. Horses scratch or groom one another a good deal, and they also use an amazing variety of what appear to be strongly positive and negative kinds of "talk."

When using food as virtually the only training reward, a trainer also has to be careful not to create "pushy" or "impolite" animals that seek food aggressively after every small action. We compare this situation to when human parents reward children for suitable performance of ordinary tasks or responsibilities (e.g., obtaining good grades or doing daily "chores") with the same, instant, on-the-spot rewards, such as money -- in effect, bribes. Children so treated are probably more likely to judge the advisability of performing any costly act by considering whether or not the immediate reward system is in place and likely to meet their demands, and less likely to develop tendencies to find their own rewards -- less likely to become self-motivated, and to generate "inner" satisfaction from their own performance. Perhaps something parallel to such inner satisfaction occurs as a result of the horse's expanding trust of the trainer and understanding of his or her signals. The ability of the horse and rider together continually to reduce the intensity and duration of guiding signals may be possible because the horse is being encouraged to do its trainer's bidding not merely willingly, but perhaps with enjoyment and enthusiasm, in positive adventures of the kind perhaps induced more easily in dogs because of the particular nature of their sociality in the wild, as predators operating in cooperative packs that include only a few reproductive individuals, the rest functioning as helpers. Dogs easily accept the role of helper to human associates, and, of course, we humans have elaborated this tendency by, as Charles Darwin put it with typically astonishing insight, selecting them to return our favors with interest. Horses have no highly cooperative background involving helpers. Perhaps the most they can do in the same direction as dogs is to become trusting and phenomenally useful co-adventurers. All certified horse freaks, maybe especially those who are also lovers of dogs, will be quite happy with this characterization of their favorite horse companions of any moment.

Generally speaking, good horse training is teaching a horse to do, with a rider on his back, the things he does naturally when free and playing in the pasture . . . . All real trainers of saddle horses work above all things for the fundamental principle of lightness in their horses, even if they do not call it by that name. After this is obtained, horses may be taught almost anything.

-- Charles O. Williamson (1950), Breaking and Training the Stock Horse.

 
