### Strategic Decision Making

Game theory is the study of strategic decision making. The name is appropriate, because game theory uses simple models that view strategic interactions much like parlor games.

The essential feature of strategic decision making is that it involves a system of interdependent decisions. Each decision maker (or player) in the decision making process has a menu of strategies. However, the outcome of the process depends not just on the strategy selected by a single player, but also on the strategies selected by all the other players. Furthermore, players have different preferences (get different payoffs) for each possible outcome. Therefore, any player, let us call her Ann, cannot select her best course of action without forming beliefs about the likely actions of the other players. The same holds true for the other decision makers. In coming to her decision, Ann can emulate the other players’ decision making processes and can try to anticipate their strategy choices. However, to arrive at their best decisions the other players must also emulate Ann’s decision making process. Therefore, Ann must not only anticipate the other players’ choices but also take into account that their choices will be conditioned by their anticipation of her choice. Strategic thinking is complex and quintessentially human.

The term strategy comes from the Greek word (strategos) for a military commander or general. A Greek military campaign (fought by coalitions of allied contingents) provides a prototype example of a strategic situation, and generalship provides a prototype example of the process of strategic decision making. In a war, the possible outcomes depend on the independent decisions of the allied and opposing forces. There are certainly conflicts of interest between opposing sides, but there may also be conflicts of interest among allies. For example, in ancient combat it was common practice for allies to give one another hostages to ensure that they would fulfill their obligations, e.g. to raise a specific number of troops, to bring them to fight at a particular place and time, and not to make a separate peace with the enemy.

The wisdom of hostage exchange testifies to the seriousness of conflicts of interest even among allies. Another common practice in classical combat illustrates a related idea. Most battles ended with a truce to bury the dead. It was critically important to both sides that their dead have a proper burial, and it was in the interest of the victor to grant the loser a burial truce in order to ensure reciprocity in a future battle in which the victor might be the loser.

The general point is that in strategic interactions, most of the players have both cooperative and conflicting interests with most of the other players. We must always cooperate with some people to compete with others. There are typically conflicts of interest even among allies and common interests even among adversaries.

Strategic situations are interesting because interdependent decisions can lead to complex and sometimes surprising strategic interaction effects. Here are some examples.

- Oligopoly: Game theory provides an alternative to the “free market” view of economics. In a competitive market, there are many suppliers, none of them has any pricing power, and they all make their production, pricing and investment decisions independently. Strategic thinking does not come into play. However, in an oligopoly, there are few suppliers; they have some pricing power, but in making their decisions, they must consider the possible responses (strategies) of the other firms. In oligopolies strategic thinking is critical and most real markets are oligopolies.
- Commitment and Deterrence: Paradoxically, in an oligopoly a firm can benefit by making a commitment that reduces its freedom of action, i.e. the choices available to it later in the game. This can happen because the firm’s commitment can change its opponents’ beliefs and their subsequent behavior in a way that benefits the firm making the commitment. Consider a commitment on the part of Tata to build an obviously expensive new mini-compact car plant. The commitment reduces Tata’s options, but it can also deter other car makers, e.g., Chrysler, from entering the mini-compact car market. Chrysler might calculate that if it too entered the market, the market would then have profit-destroying overcapacity. Chrysler’s decision to stay out benefits Tata because it allows Tata to have the mini-compact market to itself and to produce the monopoly quantity and charge the monopoly price. The logic of commitments and deterrence applies not just to oligopolies but to all forms of strategic interaction where communication can influence the players’ beliefs about other players’ strategy choices.
- Strategic Voting: In truthful voting, you simply vote for the candidate or bill that you like best. In strategic voting, you consider the possible actions of other voters. For example, you might cross over and vote for the weakest opposition candidate in the primary, because you think that other voters will be more likely to vote against him in the general election.
- Trust: In strategic interactions where players can do well by cooperating, can do best by cheating, but fare worst by being cheated, rational players may miss the cooperative opportunity. They will instead choose to cheat or choose a defensive strategy that protects them from being cheated. As the example of the Prisoners’ Dilemma (discussed below) makes clear, beneficial cooperation is not automatic, even if it is best for everyone in the long run.

### The Framework

Game theorists have developed a framework for constructing models (games) that represent strategic interactions. The framework calls for models (games) to have 3 components: Players, Strategies, and Preferences.

#### Players

In the game theory model, players behave like robots, but very smart robots. Their actions are rational in the limited sense that they behave so as to promote their preferences. In particular, they will attempt to select a strategy that results in their most preferred outcome. Furthermore, the framework assumes that players can do strategic thinking, i.e. they can form a belief about the likely actions of the other players, and as part of that process players can emulate the strategic thinking of other players.

#### Strategies

Each player is assumed to have at his disposal a list of strategies. A strategy is the robot player’s program for playing the game. Each strategy must give a complete prescription for every logically possible combination of actions at each logically possible decision point in the game. Note that if there are sub-games within the game, the strategy for the game also prescribes a strategy for each sub-game.

When each player has picked his strategy, the actual play of the game is completely predictable, because each robot player will act out his program. Game theorists call a list of strategies, one for each player, a strategy profile. Each strategy profile defines a possible outcome for the game. For example, if there are 2 players and each player has 2 strategies, there are 2 × 2 = 4 possible strategy profiles and 4 possible outcomes. If there are 3 players and each player has 3 strategies, there are 3 × 3 × 3 = 27 possible outcomes, etc.
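The counting in this example can be checked by enumeration. The following is a minimal Python sketch; the strategy labels (`a1`, `b1`, `x`, and so on) are placeholders invented for illustration and carry no meaning of their own.

```python
from itertools import product

def strategy_profiles(strategy_sets):
    """Return every strategy profile: one strategy chosen for each player."""
    return list(product(*strategy_sets))

# Hypothetical labels: 2 players with 2 strategies each.
two_players = strategy_profiles([["a1", "a2"], ["b1", "b2"]])
print(len(two_players))    # 4

# 3 players with 3 strategies each.
three_players = strategy_profiles([["x", "y", "z"]] * 3)
print(len(three_players))  # 27
```

The number of profiles is simply the product of the sizes of the players’ strategy lists, which is why it grows so quickly as players or strategies are added.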

When one player changes his strategy it can change the outcome of the game for all the other players. In practice, it may turn out that different strategy profiles lead to the same outcomes, but that is a coincidence of the particular game.

#### Preferences (or Payoffs)

For a given strategy profile (or possible play of the game), all the players will experience the same outcome. However, they may have different preferences for that outcome relative to other possible outcomes (and strategy profiles) and these different preferences may introduce conflicts or possibilities for cooperation.

The framework assumes that each player can express his preferences for various possible outcomes on a numerical scale, similar to the numerical ratings that we assign to temperature. That is, the intervals make sense but the zero point and the size of the unit are arbitrary. The rating is called a payoff.

The framework does not assume that different players use the same preference measuring scale. Preferences come into play only to guide the strategy selections of each individual player. Specifically, game theory does not assume that all players can always express their preferences in terms of money.

###### Money

It is important to note that money is an outcome, not a payoff (or preference). Money is a poor gauge of preferences, for several reasons. Money is of derivative, not fundamental, importance. Money has value only if the state has stability and sound monetary policy. In addition, people’s preference for a particular quantity of money depends profoundly on the amount of money that they have. Most importantly, money has not been around long enough for it to be a significant factor in evolution. Money is not connected directly with fundamental human motivations, which are programmed by evolution. Therefore, one finds that people will readily spend money to acquire things of no practical (or utilitarian) value but that are central to fundamental motivations, especially status seeking and lust. People spend vast sums acquiring luxury goods, status symbols whose primary purpose is display. To take another example of status seeking, a successful businessman will spend millions of his own money to run for the U.S. Senate. Lust is the stuff of comedy. People spend vast amounts of money on preparations for sex (clothing, gym memberships, dating), on investing in relationships whose primary purpose is sex, on sex for pay, and on pornography.

### Simplifying Assumptions

Game theorists make some simplifying assumptions in order to be able to draw logical and mathematical conclusions about games. These assumptions make the game models easier to manipulate but less realistic. Here are some of the most important assumptions.

#### Rationality

Players are assumed to be perfectly rational (robot-like), but only in the very limited sense that they have a consistent set of preferences and they always make decisions to optimize those preferences. Again, we have to emphasize that in the model rationality is a matter of consistency, not a matter of wisdom. A player can be rational and foolish. Players’ means are rational but their goals (preferences) are not rational in a utilitarian sense. Their preferences include the Seven Deadly Sins and the whole catalog of human folly.

#### Common Knowledge

Players are assumed to share some common knowledge about other players. In particular:

- Players are assumed to know other players’ available strategies and their preferences for various outcomes.
- All the players can rely on all the other players being rational.
- They all know that they all know the above two things, and they all know that they know that they know, etc.

### Applying Game Theory

Game Theory tells us how two rational robots would play in some model strategic situation. If people do not play as robots would play in an experimental or historical setting, it may be because (A) the actual situation may be more complex (in some important way) or (B) the players may lack common knowledge or (C) the actual players may be less capable of rational thought than robots. For example, players may not be individuals but organizations, and may therefore not have perfect recall, or the human players may lack the ability to revise probabilities properly. The point is that if a particular model fails, it tells us something interesting either about the complexity of the situation or about the limitations of human nature.

Even with its possible limitations, the game theory approach is a useful starting point for the analysis of real strategic situations. There are three critical questions.

- Who are the players?
- What are the players’ strategy choices and the possible outcomes?
- What are the players’ preferences for those outcomes?

### Strategic Thinking (Solving a Game)

The essence of strategic thinking is to select your strategy by considering the thinking process by which other players select their strategies. This is a two-step process.

- Form a belief (probability) about the strategies that the other players will select.
- Select the optimal strategy based on that belief.

Let us consider an example. Suppose the strategic situation is as follows. There are two players, Moe and Joe. Moe has invited Joe for dinner with the intention of poisoning him. Joe suspects Moe’s intention and intends to turn the tables on him.

Moe offers Joe a drink. Joe is on his guard, but accepts. In a gesture worthy of a Shakespeare play, Moe brings out two jewel-encrusted goblets, one of which contains an undetectable but lethal quantity of poison. Moe offers one goblet to Joe and keeps the other one. Now, at the critical Shakespearean moment, Moe turns his head, which gives Joe an opportunity to switch the goblets.

This being a game theory example, where the players have common knowledge, Moe knows that Joe may switch the goblets, and Joe knows that Moe knows, etc.

Moe’s strategy choices are to put the poison in the goblet that he offers (O), or in the one that he keeps (K). Joe’s strategy choices are to switch the goblets (S) or not (N).

The following table shows the possible outcomes of the game and the corresponding strategy profiles that would bring those outcomes about: OS, ON, KS, KN. In each profile, Moe’s strategy is written first.

| Moe/Joe | S | N |
|---|---|---|
| O | OS | ON |
| K | KS | KN |

The four cells have the following meaning in our narrative.

| Profile | Meaning |
|---|---|
| OS | Moe poisons the goblet that he offers, but Joe, forming the correct belief about Moe’s strategy choice, switches the goblets. Moe dies. |
| ON | Moe poisons the goblet that he offers, and Joe, believing that Moe would never be so stupid as to offer him the poisoned goblet, does not switch and drinks it. Joe dies. |
| KS | Moe poisons the goblet that he keeps, but Joe, forming an incorrect belief about Moe’s strategy choice (thinking that Moe has offered him the poisoned goblet), switches the goblets. Joe dies. |
| KN | Moe poisons the goblet that he keeps, and Joe, forming a correct belief about Moe’s strategy choice, does not switch and drinks from the goblet he was offered with no ill effects. Moe dies. |

Let us assume that each player prefers to live rather than die, so that we can assign a payoff of 1 to an outcome where the player lives and a payoff of 0 to an outcome where the player dies. Suppose that we write the payoffs for Moe and Joe as an ordered pair of numbers, where we give Moe’s (the row player’s) payoff first. Then we can represent the entire game (players, strategies, and payoffs) in the following tabular form, which is called the matrix representation of the game.

| Moe/Joe | S | N |
|---|---|---|
| O | 0,1 | 1,0 |
| K | 1,0 | 0,1 |
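One way to hold the same matrix in code is a dictionary keyed by strategy profile. This is only a sketch: the names `PAYOFFS`, `MOE`, `JOE`, and `payoff` are choices made here for illustration and are not part of the game theory framework.

```python
# Payoffs are (Moe, Joe); 1 = the player lives, 0 = the player dies.
# Keys are (Moe's strategy, Joe's strategy).
PAYOFFS = {
    ("O", "S"): (0, 1),
    ("O", "N"): (1, 0),
    ("K", "S"): (1, 0),
    ("K", "N"): (0, 1),
}
MOE, JOE = 0, 1  # positions of each player's payoff in the pair

def payoff(profile, player):
    """Look up one player's payoff for a given strategy profile."""
    return PAYOFFS[profile][player]

for profile, (moe, joe) in PAYOFFS.items():
    survivor = "Moe" if moe == 1 else "Joe"
    print("".join(profile), "->", survivor, "lives")
```

Every cell sums to 1 because exactly one of the two drinks the poison, so one player’s gain is always the other’s loss.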

For reasons that will become clear presently, it is easier to start our explanation of the process of strategic thinking or solving a game with the second step in the process.

#### Step 2: Optimizing

Given a matrix representation of a game and the results of Step 1, Step 2 is easy. Once you have settled on your best guess as to what strategies the other players will choose, you know the payoffs that will follow from each of your strategy choices. That is, the strategy profile is determined, except for your choice of a strategy. You simply select the option that gives you the highest payoff (i.e. has the highest preference value).

The table shows that if Moe believes that Joe will play his S strategy, Moe should select his K strategy, which will give Moe his most preferred payoff of 1 (life) and will give Joe his least favorite payoff of 0 (death). If Moe thinks that Joe will choose N, Moe should choose O. Joe can reason in an analogous way. Thus once Moe has done Step 1 (i.e. formed a belief about Joe’s strategy), his choice in Step 2 is easy.
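Step 2 amounts to picking the row (or column) with the highest payoff once the other player’s choice is fixed. The sketch below illustrates that for the goblet game; the function names `moe_best_response` and `joe_best_response` are invented here, and the payoff dictionary simply restates the matrix above.

```python
# Payoffs are (Moe, Joe), keyed by (Moe's strategy, Joe's strategy).
PAYOFFS = {
    ("O", "S"): (0, 1), ("O", "N"): (1, 0),
    ("K", "S"): (1, 0), ("K", "N"): (0, 1),
}

def moe_best_response(joe_strategy):
    """Moe's payoff-maximizing strategy, given a belief about Joe's choice."""
    return max(["O", "K"], key=lambda m: PAYOFFS[(m, joe_strategy)][0])

def joe_best_response(moe_strategy):
    """Joe's payoff-maximizing strategy, given a belief about Moe's choice."""
    return max(["S", "N"], key=lambda j: PAYOFFS[(moe_strategy, j)][1])

print(moe_best_response("S"))  # K
print(moe_best_response("N"))  # O
print(joe_best_response("O"))  # S
print(joe_best_response("K"))  # N
```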

#### Step 1: Forming a Belief about the other players’ strategy choices

Forming a correct belief about the other players’ strategy choices is the crux of the matter. We can attempt Step 1 by emulating the thinking of the other players. That is, we consider what-if scenarios about the other players’ strategy choices and then consider for each possibility what strategy Step 2 would yield. However, as we will see below, the what-if process does not always lead to a simple answer. The third assumption of common knowledge (that everyone knows that everyone knows, and so on) can lead to an infinite regress.

For example, suppose that Joe tries to come to a belief about Moe’s strategy choice by emulating Moe’s reasoning. Joe starts by considering the following scenario; let us call it Scenario 1.

Scenario 1: Perhaps Moe has selected his O strategy. Therefore, I should select S. However, Moe can emulate my thinking as well. I should consider a second scenario.

Scenario 2: Moe thinks that I (Joe) reasoned through Scenario 1. Consequently, Moe thinks I will choose S. Therefore Moe should choose K. But if Moe chooses K, I should choose N. However, I know that Moe can reason through Scenario 2. Therefore I should consider a third scenario.

Scenario 3: Moe thinks that I (Joe) reasoned through Scenario 2. Consequently, Moe thinks I will choose N. Therefore Moe should choose O and I should choose S.

Unfortunately, Joe is now back where he started (with the strategy profile OS). He can continue forming scenarios, but after Scenarios 4 and 5, Joe will again be back where he started, at strategy profile OS, and this process will go on indefinitely.
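The regress can be traced mechanically by alternating best responses. The sketch below is one way to do that under the payoffs given earlier; `trace_scenarios` and its arguments are hypothetical names introduced for illustration.

```python
# Payoffs are (Moe, Joe), keyed by (Moe's strategy, Joe's strategy).
PAYOFFS = {
    ("O", "S"): (0, 1), ("O", "N"): (1, 0),
    ("K", "S"): (1, 0), ("K", "N"): (0, 1),
}

def moe_best_response(joe_strategy):
    return max(["O", "K"], key=lambda m: PAYOFFS[(m, joe_strategy)][0])

def joe_best_response(moe_strategy):
    return max(["S", "N"], key=lambda j: PAYOFFS[(moe_strategy, j)][1])

def trace_scenarios(first_guess_for_moe, count):
    """Follow Joe's what-if reasoning and record the profile each scenario ends on."""
    profiles = []
    moe = first_guess_for_moe
    for _ in range(count):
        joe = joe_best_response(moe)   # Joe answers his current guess about Moe
        profiles.append(moe + joe)
        moe = moe_best_response(joe)   # Moe, emulating Joe, would then switch
    return profiles

print(trace_scenarios("O", 5))  # ['OS', 'KN', 'OS', 'KN', 'OS']
```

The trace alternates between OS and KN and never settles, which is the cycle described in Scenarios 1 through 5.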

As this example shows, forming a belief about other players’ strategy choices by emulating their thought process does not work all the time. We do not always come up with a single strategy answer for each player. Sometimes the best we can do is say that a player is likely to choose one of several strategies. Therefore, in general, we have to consider beliefs to be a statement about the probability that the other players will make certain strategy selections. In particular, game theorists consider a belief to be a probability distribution over the other players’ strategies. Such a probability distribution is just a table that lists all the other players’ strategies and assigns to each one a probability number (between 0 and 1), with the special property that the sum of all the probabilities is 1.

When a player must form a probabilistic belief about what strategies the other players will choose, he may want to hedge his own bets. He may want to play what is called a mixed strategy. In a mixed strategy a player chooses his strategy according to a probability distribution over his own strategies.
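To make the idea of a probabilistic belief concrete, the sketch below computes Joe’s expected payoff in the goblet game for a given belief about Moe. The probabilities used are assumptions chosen purely for illustration, and `joe_expected_payoff` is a name invented here.

```python
# Joe's payoffs only, keyed by (Moe's strategy, Joe's strategy).
JOE_PAYOFF = {("O", "S"): 1, ("O", "N"): 0, ("K", "S"): 0, ("K", "N"): 1}

def joe_expected_payoff(joe_strategy, belief):
    """Expected payoff to Joe, where `belief` maps Moe's strategies to probabilities."""
    if abs(sum(belief.values()) - 1.0) > 1e-9:
        raise ValueError("a belief must be a probability distribution summing to 1")
    return sum(p * JOE_PAYOFF[(m, joe_strategy)] for m, p in belief.items())

# Assumed belief: Joe thinks Moe is more likely to poison the offered goblet.
lopsided = {"O": 0.7, "K": 0.3}
print(joe_expected_payoff("S", lopsided), joe_expected_payoff("N", lopsided))

# Assumed belief: Moe mixes evenly between O and K.
even = {"O": 0.5, "K": 0.5}
print(joe_expected_payoff("S", even), joe_expected_payoff("N", even))
```

Under the lopsided belief, switching has the higher expected payoff. Under the even belief, both of Joe’s strategies have the same expected payoff, so Joe has nothing to exploit; this is the sense in which mixing lets a player hedge his bets.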

Now, in ordinary life people do not really assign probabilities and are not particularly good at it when they do. If this were the end of the story, game theory would not be very interesting as a tool for analyzing everyday political and economic situations. However, there are special games for which one can arrive at a belief about the other players’ individual strategy selections. These special cases are fascinating models of real situations. The key point is that these special strategic situations have been around for all of evolutionary time and it is reasonable to suppose that social animals, including humans, have adaptations to help them recognize and cope with them.

These cases are also so well studied that they have well known names. In what follows, we will consider the best known, most extensively studied, and probably the most important special case, which is known as The Prisoners’ Dilemma.

### The Prisoners’ Dilemma

The Prisoners’ Dilemma is a model of a situation in which there are two players: Player 1 (P1) and Player 2 (P2) who have an opportunity for beneficial cooperation. In other words, the players have made (or could make) a bargain that generates some surplus value in which both parties can share. In the Prisoners’ Dilemma game both players must simultaneously (independently) choose one of two strategies: to fulfill the terms of the agreement (F) or to default (D). The defining feature of a Prisoners’ Dilemma situation is that payoffs (players’ preferences) are distributed over the outcomes so that the following conditions hold.

- Cheating is best. That is, for each player, the strategy profile where the other player fulfills and he himself defaults is the most preferred outcome.
- Cooperating is good. That is, the strategy profile where you and the other player fulfill your agreement is preferable to the other outcomes, except the one where the other player fulfills and you default.
- Failing to cooperate is bad. That is, both you and the other player prefer the strategy profile where you both fulfill to the one where you both default.
- Being played for a sucker is worst. That is, for each player, the strategy profile where he himself fulfills and the other player defaults is the least preferred outcome.

Prisoners’ Dilemma games are interesting because (as we will see below) rational players will miss the opportunity to get the good outcome and choose strategies that lead to the bad outcome.

Prisoners’ Dilemma situations are ubiquitous and the model can apply to most situations where there is an explicit or implicit bargain (or expectation of reciprocity) between two people. For example, any exchange-of-favors bargain meets the conditions to be a Prisoners’ Dilemma. Suppose that both players have an agreement to invest $2 to do the other player a favor that is worth $3 to him, but neither party can verify that the other party has fulfilled their agreement. The inability to verify is critical, because in the model we assume that both players choose their strategy simultaneously, i.e. without knowing the other player’s strategy choice.

In this example, it makes sense to assume that the players’ preferences are the same as the monetary outcomes. We can then represent the game’s players, strategies, and payoffs in the form of a simple table as follows.

| P1/P2 | F | D |
|---|---|---|
| F | 1, 1 | -2, 3 |
| D | 3, -2 | 0, 0 |

The rows of the table represent P1’s strategies. The columns represent P2’s strategy choices. In the cells of the table, one always shows the row player’s (P1’s) payoff first and the column player’s (P2’s) payoff second.

- From P1’s point of view the best outcome is DF. If P1 (row) chooses strategy D and P2 (column) chooses the F strategy, P1 has cheated and played P2 for a sucker. If P2 fulfills the agreement and invests $2, but P1 defaults on the agreement and invests nothing, P2 is out $2 and P1 still receives the favor worth $3. The cell for strategy profile DF contains the numbers 3 and -2, indicating that the payoffs for P1 and P2 are 3 and -2 respectively.
- The good outcome follows if P1 and P2 both select their fulfillment strategy, F. They both spend $2 and get $3. The strategy profile FF gives both parties a net benefit of $1, i.e. the cell has payoff numbers 1, 1.
- The bad outcome follows if P1 and P2 both choose to default on their agreement; they invest nothing and gain nothing. The strategy profile DD yields no added value for either player, i.e. the payoff numbers are 0, 0.
- From P1’s point of view, the worst case follows from the strategy profile FD, where P2 has cheated and played P1 for a sucker. Put another way, P2 is a free-rider on P1’s investment. The cell for FD contains the payoff numbers -2, 3.
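The whole table can be derived from the two dollar figures in the favor-exchange story, and the four defining conditions can then be checked directly. The following sketch assumes, as the text does, that payoffs equal net dollars; the names `COST`, `BENEFIT`, `payoffs`, and `is_prisoners_dilemma` are invented for illustration.

```python
COST, BENEFIT = 2, 3  # dollars invested in a favor, and its worth to the recipient

def payoffs(p1, p2):
    """Net dollar payoffs (P1, P2) when each plays 'F' (fulfill) or 'D' (default)."""
    def net(mine, theirs):
        received = BENEFIT if theirs == "F" else 0
        spent = COST if mine == "F" else 0
        return received - spent
    return net(p1, p2), net(p2, p1)

TABLE = {(a, b): payoffs(a, b) for a in "FD" for b in "FD"}

def is_prisoners_dilemma(table):
    """Check cheating > cooperating > mutual default > being the sucker, for both players."""
    p1_ok = table[("D", "F")][0] > table[("F", "F")][0] > table[("D", "D")][0] > table[("F", "D")][0]
    p2_ok = table[("F", "D")][1] > table[("F", "F")][1] > table[("D", "D")][1] > table[("D", "F")][1]
    return p1_ok and p2_ok

print(TABLE)
print(is_prisoners_dilemma(TABLE))  # True
```

Any cost and benefit with the benefit larger than the cost, and both positive, would give the same ordering; the specific values 2 and 3 come from the example.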

#### How it Works

In the Prisoners’ Dilemma, Step 1 (forming a belief about the other player’s strategy selection) is unusually easy. To see why, consider the game first from P1’s perspective. P1 can reason as follows.

“Suppose P2 thinks I will choose F, so that we are operating in row F. Then (in Step 2) P2 would want to choose D (i.e. he should cheat) so that he gets 3 rather than 1.”

“Alternatively, if P2 thinks I will choose D (so that we are operating in Row D) he should choose D, so that he avoids -2 (being played for a sucker).”

Consequently, P1 should believe that P2 will choose D no matter what P1 does.

P2 can reason in an analogous manner to conclude that P1 will choose D no matter what P2 does. When one strategy (such as F) always yields a lower payoff than another strategy (such as D), no matter what the other player does, game theorists call it a dominated strategy. In this case, D dominates F. Each player can form a belief about what the other player will do by emulating his thought process. In the case of the Prisoners’ Dilemma the process is easy. Each player can reason that the other player, being rational, will not play his dominated strategy.

Given that in the Prisoners’ Dilemma Step 1 is easy, Step 2 leads to an immediate and easy solution for the whole game. Both parties, knowing that the other party will choose D, should also choose D.
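The dominance argument can be verified cell by cell. The sketch below checks it for the payoff table given earlier; `dominates` is a helper name introduced here, and it tests strict dominance (a strictly higher payoff against every choice by the other player).

```python
# Payoffs are (P1, P2), keyed by (P1's strategy, P2's strategy).
PD = {
    ("F", "F"): (1, 1), ("F", "D"): (-2, 3),
    ("D", "F"): (3, -2), ("D", "D"): (0, 0),
}
STRATEGIES = ["F", "D"]

def dominates(player, better, worse):
    """True if `better` beats `worse` for `player` (0 = P1, 1 = P2) against every opposing strategy."""
    for other in STRATEGIES:
        if player == 0:
            hi, lo = PD[(better, other)][0], PD[(worse, other)][0]
        else:
            hi, lo = PD[(other, better)][1], PD[(other, worse)][1]
        if hi <= lo:
            return False
    return True

print(dominates(0, "D", "F"))  # True: D dominates F for P1
print(dominates(1, "D", "F"))  # True: D dominates F for P2
print(dominates(0, "F", "D"))  # False
```

Because D dominates F for both players, neither needs to guess what the other will do, and the predicted profile is DD with payoffs 0, 0.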

#### Features

The Prisoners’ Dilemma has interesting, surprising, and frustrating features.

Because cheating pays better than cooperating, each player has an incentive (temptation) to cheat. The strategy of defaulting (D) is defensive in that it keeps you from being cheated. The fulfillment strategy (F) is risky, because it exposes you to the possibility of being cheated. The risk is worth taking only if you believe that the other player will be irrationally trusting. If all players were irrationally trusting (like family members, or religious zealots) they would be better off.

Both players, being rational, will play D, but the strategy profile DD is worse than the profile FF. Furthermore, it is worse in a very specific way. FF is more Pareto efficient than DD. Game theorists and economists call an outcome more Pareto efficient than another if switching to that outcome would make at least one player better off without making any other player worse off. In this case, switching to FF makes everyone better off (i.e. both players get the good outcome rather than the bad one).
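The Pareto comparison is a simple test on payoff pairs. The sketch below applies it to the table given earlier; the function name `pareto_improves` is an assumption made for illustration.

```python
# Payoffs are (P1, P2), keyed by the strategy profile written as a two-letter string.
PD = {"FF": (1, 1), "FD": (-2, 3), "DF": (3, -2), "DD": (0, 0)}

def pareto_improves(new, old):
    """True if moving from `old` to `new` hurts no one and helps at least one player."""
    pairs = list(zip(PD[new], PD[old]))
    nobody_worse = all(n >= o for n, o in pairs)
    somebody_better = any(n > o for n, o in pairs)
    return nobody_worse and somebody_better

print(pareto_improves("FF", "DD"))  # True: both players gain
print(pareto_improves("DF", "DD"))  # False: P2 would be worse off
print(pareto_improves("FF", "DF"))  # False: P1 would be worse off
```

Only the move from DD to FF passes the test, which is why the rational outcome DD is called inefficient.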

The situation is a dilemma because the players are forced to choose between fulfilling their agreement (which requires trust) and a more tempting strategy of defaulting.

No one would play a Prisoners’ Dilemma game unless forced to do so, because both players can reason from the way the payoffs are distributed that they will both default on their agreement. Hence there is no reason for them to make such a bargain in the first place. That is, the players would play the game only if they were in some sense prisoners who were forced to play. In most real cases, the players would simply fail to reach an agreement that would benefit them both.

Communication does not help to resolve the Prisoners’ Dilemma so that players can get the good outcome. The central issue is one of trust and no promise can make the parties believe one another because their promises would not be credible.

In a Prisoners’ Dilemma situation you can avoid the bad (inefficient) outcome (D, D) or no deal by the following game-changing measures:

- Repeated play, which opens the possibility of future payoffs and negative retaliation for cheating (i.e. the other party may stop cooperating).
- Enforceable contracts, which can decrease the payoff for cheating (assuming that the provisions are verifiable and that there is an enforcement mechanism, such as a government with a court system).
- A credible threat of positive retaliation (i.e., the other party may be able to beat you senseless). This also decreases the payoff for cheating, but does not require a third party.
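As a rough illustration of how decreasing the payoff for cheating changes the game, the sketch below adds a penalty for defaulting to the favor-exchange example. The penalty amounts and the rule that any defaulter pays it are assumptions made here for illustration; they are not part of the original example.

```python
COST, BENEFIT = 2, 3  # from the favor-exchange example

def payoff(mine, theirs, fine=0):
    """Net payoff to one player; `fine` is a hypothetical penalty charged to a defaulter."""
    received = BENEFIT if theirs == "F" else 0
    spent = COST if mine == "F" else 0
    penalty = fine if mine == "D" else 0
    return received - spent - penalty

def default_dominates(fine):
    """True if D pays strictly more than F whatever the other player does."""
    return all(payoff("D", t, fine) > payoff("F", t, fine) for t in "FD")

def fulfill_dominates(fine):
    """True if F pays strictly more than D whatever the other player does."""
    return all(payoff("F", t, fine) > payoff("D", t, fine) for t in "FD")

for fine in (0, 1, 3):  # assumed penalty sizes
    print(fine, default_dominates(fine), fulfill_dominates(fine))
```

With no penalty, or a penalty smaller than the $2 cost of fulfilling, defaulting still dominates. Once the penalty exceeds that cost, fulfilling becomes the dominant strategy and the good outcome FF is what rational players would choose.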