Repeated Games: How Cooperation Emerges from Self-Interest
In the Prisoner’s Dilemma, rational players defect. In the Tragedy of the Commons, rational actors destroy shared resources. One-shot game theory seems to paint a bleak picture: selfishness always wins.
But real life isn’t a series of one-shot games. We interact with the same people, companies, and countries repeatedly. And this changes everything.
Welcome to repeated games — where cooperation emerges not from altruism, but from enlightened self-interest.
The One-Shot Prisoner’s Dilemma (Recap)
Remember the classic setup:
| | Cooperate | Defect |
|---|---|---|
| Cooperate | -1, -1 | -3, 0 |
| Defect | 0, -3 | -2, -2 |
(Payoffs represent years in prison; lower is better)
Nash Equilibrium: (Defect, Defect) with payoffs (-2, -2)
Optimal outcome: (Cooperate, Cooperate) with payoffs (-1, -1)
In a one-shot game, defection is a dominant strategy. But what if you play this game 100 times with the same partner?
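To see the dominance concretely, here is a minimal Python sketch of the one-shot game (the payoff dictionary and `best_response` helper are illustrative names, not from any library):

```python
# Payoff matrix for the one-shot Prisoner's Dilemma.
# Entries are (row player, column player) payoffs in negative years of
# prison, so a larger (less negative) number is better.
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}

def best_response(opponent_action):
    """Row player's payoff-maximizing action against a fixed opponent action."""
    return max("CD", key=lambda a: PAYOFFS[(a, opponent_action)][0])

# Defect is the best response whether the opponent cooperates or defects,
# which is exactly what makes it a dominant strategy.
print(best_response("C"))  # D
print(best_response("D"))  # D
```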
Enter the Repeated Game
A repeated game (also called a supergame) is one in which the same stage game is played multiple times by the same players.
Key features:
- Players know the history of previous plays
- Future payoffs matter (though possibly discounted)
- Strategies can be conditional on past behavior
- Reputation becomes valuable
At each stage the players face the same payoff matrix, but their choices can now depend on the entire history of play.
The Folk Theorem: Cooperation is Possible
The Folk Theorem is one of game theory’s most important results. It states:
In infinitely repeated games, any outcome that gives each player at least their “minmax” payoff can be sustained as a Nash equilibrium with appropriate strategies.
Translation: Almost any outcome — including full cooperation — can be an equilibrium if the game repeats enough and players are patient enough.
Why? Because the threat of future punishment makes defection unprofitable today.
The Strategy That Changes Everything: Tit-for-Tat
Around 1980, political scientist Robert Axelrod held a tournament: submit a strategy for the repeated Prisoner’s Dilemma, and he would run a round-robin competition to see which performed best.
The winner? Tit-for-Tat, submitted by Anatol Rapoport.
Tit-for-Tat Rules:
- Start by cooperating (be nice)
- Then do whatever your opponent did last round (reciprocate)
That’s it. Incredibly simple, yet devastatingly effective.
Why it works:
- It’s nice: never defects first
- It’s retaliatory: punishes defection immediately
- It’s forgiving: returns to cooperation if opponent does
- It’s clear: easy for opponents to understand and predict
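The two rules above fit in a few lines of code. Here is a sketch (the function name and history convention are mine, not Axelrod's original submission format):

```python
def tit_for_tat(history):
    """Tit-for-Tat: cooperate first, then mirror the opponent's last move.
    `history` is the list of the opponent's past actions, "C" or "D"."""
    if not history:
        return "C"       # nice: never defect first
    return history[-1]   # reciprocate: copy whatever they did last round

print(tit_for_tat([]))          # C
print(tit_for_tat(["C", "D"]))  # D  (retaliatory)
print(tit_for_tat(["D", "C"]))  # C  (forgiving: returns to cooperation)
```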
Example: Tit-for-Tat vs Always Defect
Let’s simulate 5 rounds:
| Round | Player 1 (Tit-for-Tat) | Player 2 (Always Defect) | P1 Payoff | P2 Payoff |
|---|---|---|---|---|
| 1 | Cooperate | Defect | -3 | 0 |
| 2 | Defect | Defect | -2 | -2 |
| 3 | Defect | Defect | -2 | -2 |
| 4 | Defect | Defect | -2 | -2 |
| 5 | Defect | Defect | -2 | -2 |
Total: Player 1: -11, Player 2: -8
Player 2 comes out ahead (-8 vs. -11), but the entire advantage comes from exploiting the first round. As the game continues, both players are locked into mutual defection, and that one-time gain shrinks relative to the cooperation payoff both have forfeited.
Example: Tit-for-Tat vs Tit-for-Tat
Now let’s see two cooperative strategies interact:
| Round | Player 1 (Tit-for-Tat) | Player 2 (Tit-for-Tat) | P1 Payoff | P2 Payoff |
|---|---|---|---|---|
| 1 | Cooperate | Cooperate | -1 | -1 |
| 2 | Cooperate | Cooperate | -1 | -1 |
| 3 | Cooperate | Cooperate | -1 | -1 |
| 4 | Cooperate | Cooperate | -1 | -1 |
| 5 | Cooperate | Cooperate | -1 | -1 |
Total: Player 1: -5, Player 2: -5
Both players achieve the cooperative optimum. This is the power of reciprocity.
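Both matchups above are easy to reproduce in code. A minimal round-robin sketch (the `play` harness and strategy signatures are illustrative, assuming each strategy sees only the opponent's history):

```python
PAYOFFS = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
           ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}

def tit_for_tat(opp_history):
    return "C" if not opp_history else opp_history[-1]

def always_defect(opp_history):
    return "D"

def play(strategy1, strategy2, rounds=5):
    """Play `rounds` repetitions and return total payoffs for each player."""
    h1, h2 = [], []          # each player's own past actions
    total1 = total2 = 0
    for _ in range(rounds):
        a1 = strategy1(h2)   # each strategy sees the *opponent's* history
        a2 = strategy2(h1)
        p1, p2 = PAYOFFS[(a1, a2)]
        total1 += p1
        total2 += p2
        h1.append(a1)
        h2.append(a2)
    return total1, total2

print(play(tit_for_tat, always_defect))  # (-11, -8), matching the first table
print(play(tit_for_tat, tit_for_tat))    # (-5, -5), matching the second
```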
The Mathematics: When Does Cooperation Pay?
Let’s formalize this. Suppose:
- You cooperate each round and get payoff C
- If you defect once, you get payoff D (where D > C)
- But then your opponent retaliates, and you both defect forever, getting payoff P per round (where P < C < D)
Cooperating forever: Total payoff = C + δC + δ²C + … = C/(1-δ) (where δ is the discount factor: how much you value future payoffs)
Defecting once: Total payoff = D + δP + δ²P + … = D + Pδ/(1-δ)
Cooperation is better when:
C/(1-δ) ≥ D + Pδ/(1-δ)
Simplifying:
C ≥ D(1-δ) + Pδ
δ ≥ (D-C)/(D-P)
Translation: You need to care enough about the future (high δ) for cooperation to be worthwhile.
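The threshold δ ≥ (D-C)/(D-P) is a one-liner to compute. A sketch with illustrative payoffs (temptation 5, reward 3, punishment 1; the function name is mine):

```python
def cooperation_threshold(C, D, P):
    """Minimum discount factor at which cooperating forever beats a
    one-shot defection followed by permanent punishment:
    delta >= (D - C) / (D - P), assuming P < C < D."""
    return (D - C) / (D - P)

# Example payoffs: temptation D=5, cooperation reward C=3, punishment P=1.
delta_min = cooperation_threshold(C=3, D=5, P=1)
print(delta_min)  # 0.5: any delta of at least 0.5 sustains cooperation here
```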
The Discount Factor: How Much Do You Care About Tomorrow?
The discount factor δ represents how much you value future payoffs relative to immediate payoffs.
- δ = 0: Only care about today (no cooperation)
- δ = 0.5: Tomorrow’s payoff is worth half of today’s
- δ = 0.9: Tomorrow’s payoff is worth 90% of today’s
- δ = 1: Future payoffs count equally (perfect patience)
Real-world implications:
- Dying industries: Low δ → expect less cooperation
- Stable partnerships: High δ → cooperation thrives
- International relations: High δ → treaties sustainable
- Criminal underworld: Uncertain future → lower δ, but reputation effects can compensate
The Finite Horizon Problem
What if the game has a known end point? Say, exactly 10 rounds?
This creates a problem, via a reasoning technique called backward induction (which we’ll explore in detail in the next post).
Logic:
- In round 10 (the last round), there’s no future, so defect
- Knowing round 10 ends in defection, there’s no reason to cooperate in round 9
- By backward induction, both players defect in every round
The paradox: The Nash equilibrium of a finitely repeated Prisoner’s Dilemma is mutual defection in every round — the same as the one-shot game!
In practice: People still cooperate in finite games, partly because:
- They’re not perfectly rational
- They might miscount or forget the exact end
- Reputation matters beyond this specific game
- They have social preferences (fairness, reciprocity)
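One piece of the unraveling argument can be checked directly: against an opponent who defects in every round (as backward induction predicts), cooperating in any round strictly lowers your own total, so all-defect is a best response. A minimal sketch (the helper names are mine):

```python
PAYOFFS = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
           ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}

def my_total(my_actions, opp_actions):
    """Row player's undiscounted total over a finite sequence of rounds."""
    return sum(PAYOFFS[(a, b)][0] for a, b in zip(my_actions, opp_actions))

ROUNDS = 10
all_defect = ["D"] * ROUNDS

# Against all-defect, deviating to cooperate in any single round
# swaps a -2 for a -3, strictly lowering your total:
base = my_total(all_defect, all_defect)   # -20
for k in range(ROUNDS):
    deviation = list(all_defect)
    deviation[k] = "C"
    assert my_total(deviation, all_defect) == base - 1

print(base)  # -20: mutual defection every round
```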
Other Successful Strategies
While Tit-for-Tat is excellent, other strategies also perform well:
1. Generous Tit-for-Tat
Like Tit-for-Tat, but occasionally forgives defection (cooperates with some probability even after opponent defects).
Advantage: More robust to noise (accidental defections)
2. Tit-for-Two-Tats
Only retaliate after two consecutive defections.
Advantage: More forgiving, less prone to mutual punishment spirals
3. Pavlov (Win-Stay, Lose-Shift)
- If you did well last round, repeat your action
- If you did poorly, switch actions
Advantage: Can exploit unconditional cooperators while maintaining cooperation with reciprocators
4. Grim Trigger
Cooperate until opponent defects once, then defect forever.
Advantage: Creates strong deterrent against defection
Disadvantage: Unforgiving; one mistake ruins the relationship forever
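All four strategies are easy to express as functions of the play history. A sketch (function names, the history convention, and the 10% forgiveness probability are my illustrative choices):

```python
import random

def generous_tit_for_tat(opp, forgive=0.1, rng=random.Random(0)):
    """Tit-for-Tat, but after an opponent defection it still cooperates
    with probability `forgive` (0.1 here is an arbitrary illustration)."""
    if not opp or opp[-1] == "C":
        return "C"
    return "C" if rng.random() < forgive else "D"

def tit_for_two_tats(opp):
    """Retaliate only after two consecutive defections."""
    return "D" if opp[-2:] == ["D", "D"] else "C"

def pavlov(my, opp):
    """Win-Stay, Lose-Shift: repeat your last action after a good round
    (opponent cooperated), switch after a bad one (opponent defected)."""
    if not my:
        return "C"
    if opp[-1] == "C":
        return my[-1]                       # win: stay
    return "C" if my[-1] == "D" else "D"    # lose: shift

def grim_trigger(opp):
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in opp else "C"
```

Each function takes the opponent's action history (Pavlov also needs its own) and returns "C" or "D", so they can all be dropped into the same round-robin harness.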
Real-World Applications
1. International Trade
Countries cooperate on trade deals not because they’re altruistic, but because:
- They trade repeatedly
- Defection (tariffs, breaking agreements) triggers retaliation
- The shadow of the future creates incentives for cooperation
2. Business Relationships
Why do suppliers deliver quality products?
- Repeat business is more valuable than one-time defection
- Reputation matters (multilateral repeated games)
- Long-term contracts implicitly enforce cooperation
3. Social Norms
Why do people follow unwritten rules?
- We interact repeatedly in communities
- Defection (rudeness, cheating) gets punished by others
- Reputation systems enforce cooperation
4. Criminal Organizations
Even illegal groups sustain cooperation through:
- Repeated interactions
- Harsh punishments for defection (omertà in the Mafia)
- High value placed on future interactions (δ)
The Evolutionary Perspective
Repeated games also explain how cooperation evolved biologically.
When organisms interact repeatedly:
- Cooperators who use Tit-for-Tat outcompete unconditional defectors
- Reciprocal altruism becomes evolutionarily stable
- Examples: vampire bats sharing blood, cleaner fish relationships
Evolution doesn’t require conscious strategy — simple rules like “help those who help you” can emerge through natural selection.
Common Mistakes
Mistake 1: Being Too Nice
Always cooperating (even when exploited) doesn’t work. You need to retaliate to incentivize cooperation.
Mistake 2: Being Too Mean
Always defecting prevents the cooperative equilibrium from ever emerging. You need to reward cooperation.
Mistake 3: Being Unforgiving
Grim Trigger strategies lock you into mutual defection after one mistake. Forgiveness allows relationships to recover.
Mistake 4: Ignoring Noise
In real life, mistakes happen (miscommunication, accidents). Strategies need to be robust to noise.
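The noise problem is easy to demonstrate by injecting a single forced error and watching how two strategies recover. A sketch (the harness and the "one slip in round 3" setup are my illustration):

```python
PAYOFFS = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
           ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}

def tit_for_tat(opp):
    return "C" if not opp else opp[-1]

def tit_for_two_tats(opp):
    return "D" if opp[-2:] == ["D", "D"] else "C"

def play_with_one_slip(strategy, rounds=8, slip_round=3):
    """Two copies of `strategy` play each other, but player 1's action in
    `slip_round` is flipped to "D" to model one accidental defection."""
    h1, h2 = [], []
    for r in range(rounds):
        a1 = strategy(h2)
        a2 = strategy(h1)
        if r == slip_round:
            a1 = "D"  # the noise event
        h1.append(a1)
        h2.append(a2)
    return h1, h2

# Tit-for-Tat: the slip echoes forever as alternating retaliation.
print(play_with_one_slip(tit_for_tat)[0])      # ends ..., D, C, D
# Tit-for-Two-Tats: the slip is absorbed and cooperation resumes.
print(play_with_one_slip(tit_for_two_tats)[0])  # ends ..., C, C, C
```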
Key Takeaways
- Repeated games fundamentally change incentives — cooperation can emerge from pure self-interest
- Tit-for-Tat is remarkably effective: nice, retaliatory, forgiving, clear
- The discount factor matters: you need to value the future enough (high δ)
- Known finite horizons undermine cooperation (backward induction)
- Real-world cooperation often relies on repeated interaction: trade, business, social norms
- Balance is key: be cooperative but not exploitable, retaliatory but forgiving
Practice Problem
You’re playing a repeated Prisoner’s Dilemma with payoffs:
- Both cooperate: 3 each
- You defect, they cooperate: 5 to you, 0 to them
- Both defect: 1 each
You’re considering playing Tit-for-Tat. Your opponent might always defect or might also play Tit-for-Tat.
If your discount factor is δ = 0.9, should you play Tit-for-Tat?
Solution
Against Always Defect:
- Tit-for-Tat: 0 (first round) + 0.9·1/(1-0.9) = 0 + 9 = 9
- Always Defect: 1/(1-0.9) = 10
- (Tit-for-Tat gives up only 1 by cooperating once before switching to defection)
Against Tit-for-Tat:
- Tit-for-Tat vs TFT: 3/(1-0.9) = 30
- Always Defect: 5 + 1·0.9/(1-0.9) = 5 + 9 = 14
Conclusion: Play Tit-for-Tat. Against TFT players, you do much better (30 vs 14). Against Always Defect, you lose only 1 (9 vs 10). As long as there is a reasonable chance your opponent is a reciprocator, TFT is the better bet.
This shows how reciprocity pays when others are also reciprocators!
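The four discounted sums are quick to verify numerically. A sketch, with the sucker payoff 0 and temptation 5 from the problem setup (the helper name is mine):

```python
def discounted(first, per_round, delta):
    """Payoff `first` in round 1, then `per_round` forever:
    first + delta * per_round / (1 - delta)."""
    return first + delta * per_round / (1 - delta)

delta = 0.9

# TFT vs Always Defect: sucker payoff 0 once, then mutual defection (1).
tft_vs_alld = discounted(0, 1, delta)   # about 9
# Always Defect vs Always Defect: 1 every round.
alld_vs_alld = 1 / (1 - delta)          # about 10
# TFT vs TFT: 3 every round.
tft_vs_tft = 3 / (1 - delta)            # about 30
# Always Defect vs TFT: temptation 5 once, then mutual defection.
alld_vs_tft = discounted(5, 1, delta)   # about 14

print(tft_vs_alld, alld_vs_alld, tft_vs_tft, alld_vs_tft)
```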
What’s Next?
Repeated games show how cooperation emerges when we look forward. But how do we solve games by looking backward? When you know the game ends in a specific way, you can work backwards to figure out the optimal play from the start.
Next, we’ll explore Backward Induction — the technique that lets you solve complex sequential games by reasoning from the end to the beginning.
This post is part of the Game Theory Series, where we explore the mathematics of strategic decision-making.