Repeated Games: How Cooperation Emerges from Self-Interest

In the Prisoner’s Dilemma, rational players defect. In the Tragedy of the Commons, rational actors destroy shared resources. One-shot game theory seems to paint a bleak picture: selfishness always wins.

But real life isn’t a series of one-shot games. We interact with the same people, companies, and countries repeatedly. And this changes everything.

Welcome to repeated games — where cooperation emerges not from altruism, but from enlightened self-interest.


The One-Shot Prisoner’s Dilemma (Recap)

Remember the classic setup:

|           | Cooperate | Defect |
|-----------|-----------|--------|
| Cooperate | -1, -1    | -3, 0  |
| Defect    | 0, -3     | -2, -2 |

(Payoffs represent years in prison; lower is better)

Nash Equilibrium: (Defect, Defect) with payoffs (-2, -2)

Optimal outcome: (Cooperate, Cooperate) with payoffs (-1, -1)

In a one-shot game, defection is a dominant strategy. But what if you play this game 100 times with the same partner?
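
As a quick check of that dominant-strategy claim, here's a minimal sketch in Python (the `PAYOFFS` table and `best_response` helper are our own constructions, purely for illustration):

```python
# Payoffs from the matrix above: (row player, column player), in negative years.
PAYOFFS = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
           ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}

def best_response(opponent_action):
    """Row player's best action against a fixed opponent action."""
    return max(["C", "D"], key=lambda a: PAYOFFS[(a, opponent_action)][0])

# Defect is best whether the opponent cooperates or defects: a dominant strategy.
print(best_response("C"))  # D  (0 beats -1)
print(best_response("D"))  # D  (-2 beats -3)
```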


Enter the Repeated Game

A repeated game (also called a supergame) is one in which the same stage game is played multiple times by the same players.

Key features:

  1. Players know the history of previous plays
  2. Future payoffs matter (though possibly discounted)
  3. Strategies can be conditional on past behavior
  4. Reputation becomes valuable

The game tree looks like this:

```mermaid
%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#fff','secondaryTextColor':'#fff','tertiaryTextColor':'#fff','textColor':'#fff','nodeTextColor':'#fff'}}}%%
graph TD
    A[Round 1] --> B[History: outcome 1]
    B --> C[Round 2]
    C --> D[History: outcomes 1,2]
    D --> E[Round 3]
    E --> F[History: outcomes 1,2,3]
    F --> G[Round 4...]
    H[Actions depend on history] -.-> C
    H -.-> E
    H -.-> G
    I[Future matters] -.-> A
    I -.-> C
    I -.-> E
    style A fill:#2d3748,stroke:#4299e1,stroke-width:2px
    style I fill:#2d3748,stroke:#48bb78,stroke-width:3px
```

The Folk Theorem: Cooperation is Possible

The Folk Theorem is one of game theory’s most important results. It states:

In infinitely repeated games, any feasible outcome that gives each player at least their “minmax” payoff can be sustained as a Nash equilibrium with appropriate strategies.

Translation: Almost any outcome — including full cooperation — can be an equilibrium if the game repeats enough and players are patient enough.

Why? Because the threat of future punishment makes defection unprofitable today.


The Strategy That Changes Everything: Tit-for-Tat

In the early 1980s, political scientist Robert Axelrod held a tournament: submit a strategy for the repeated Prisoner’s Dilemma, and he would run a round-robin competition to see which performed best.

The winner? Tit-for-Tat, submitted by Anatol Rapoport.

Tit-for-Tat Rules:

  1. Start by cooperating (be nice)
  2. Then do whatever your opponent did last round (reciprocate)

That’s it. Incredibly simple, yet devastatingly effective.

Why it works:

  • It’s nice: never defects first
  • It’s retaliatory: punishes defection immediately
  • It’s forgiving: returns to cooperation if opponent does
  • It’s clear: easy for opponents to understand and predict
```mermaid
%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#fff','secondaryTextColor':'#fff','tertiaryTextColor':'#fff','textColor':'#fff','nodeTextColor':'#fff'}}}%%
stateDiagram-v2
    [*] --> Cooperate: Start
    Cooperate --> Cooperate: Opponent cooperated
    Cooperate --> Defect: Opponent defected
    Defect --> Cooperate: Opponent cooperated
    Defect --> Defect: Opponent defected
    note right of Cooperate: Reward cooperation
    note right of Defect: Punish defection
```

Example: Tit-for-Tat vs Always Defect

Let’s simulate 5 rounds:

| Round | Player 1 (Tit-for-Tat) | Player 2 (Always Defect) | P1 Payoff | P2 Payoff |
|-------|------------------------|--------------------------|-----------|-----------|
| 1     | Cooperate              | Defect                   | -3        | 0         |
| 2     | Defect                 | Defect                   | -2        | -2        |
| 3     | Defect                 | Defect                   | -2        | -2        |
| 4     | Defect                 | Defect                   | -2        | -2        |
| 5     | Defect                 | Defect                   | -2        | -2        |

Total: Player 1: -11, Player 2: -8

Player 2 comes out ahead, but only by a one-time windfall: after round 1, both players are locked into mutual defection, so the gap never grows while the losses keep piling up. Over a long game, that initial advantage becomes negligible in per-round terms.


Example: Tit-for-Tat vs Tit-for-Tat

Now let’s see two cooperative strategies interact:

| Round | Player 1 (Tit-for-Tat) | Player 2 (Tit-for-Tat) | P1 Payoff | P2 Payoff |
|-------|------------------------|------------------------|-----------|-----------|
| 1     | Cooperate              | Cooperate              | -1        | -1        |
| 2     | Cooperate              | Cooperate              | -1        | -1        |
| 3     | Cooperate              | Cooperate              | -1        | -1        |
| 4     | Cooperate              | Cooperate              | -1        | -1        |
| 5     | Cooperate              | Cooperate              | -1        | -1        |

Total: Player 1: -5, Player 2: -5

Both players achieve the cooperative optimum. This is the power of reciprocity.
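
Both tables are easy to reproduce with a short simulation. Here's a sketch (the strategy and function names are our own choices, not from Axelrod's tournament code):

```python
def tit_for_tat(my_hist, opp_hist):
    # Cooperate first, then copy the opponent's previous move.
    return "C" if not opp_hist else opp_hist[-1]

def always_defect(my_hist, opp_hist):
    return "D"

# Payoffs from the matrix above (years in prison, so less negative is better).
PAYOFFS = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
           ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}

def play(strat1, strat2, rounds):
    h1, h2, t1, t2 = [], [], 0, 0
    for _ in range(rounds):
        a1, a2 = strat1(h1, h2), strat2(h2, h1)
        p1, p2 = PAYOFFS[(a1, a2)]
        h1.append(a1); h2.append(a2)
        t1 += p1; t2 += p2
    return t1, t2

print(play(tit_for_tat, always_defect, 5))  # (-11, -8), matching the first table
print(play(tit_for_tat, tit_for_tat, 5))    # (-5, -5), matching the second table
```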


The Mathematics: When Does Cooperation Pay?

Let’s formalize this. Suppose:

  • You cooperate each round and get payoff C
  • If you defect once, you get payoff D (where D > C)
  • But then your opponent retaliates, and you both defect forever, getting payoff P per round (where P < C < D)

Cooperating forever: Total payoff = C + δC + δ²C + … = C/(1-δ) (where δ is the discount factor: how much you value future payoffs)

Defecting once: Total payoff = D + δP + δ²P + … = D + Pδ/(1-δ)

Cooperation is better when:

C/(1-δ) ≥ D + Pδ/(1-δ)

Simplifying:

C ≥ D(1-δ) + Pδ

δ ≥ (D-C)/(D-P)

Translation: You need to care enough about the future (high δ) for cooperation to be worthwhile.
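
To make the condition concrete, here's a minimal numerical check. The payoffs C=3, D=5, P=1 are hypothetical (the same numbers used in the practice problem at the end of this post):

```python
def critical_delta(C, D, P):
    """Smallest discount factor at which cooperating forever beats
    defecting once and being punished forever: delta >= (D-C)/(D-P)."""
    return (D - C) / (D - P)

def cooperate_forever(C, delta):
    return C / (1 - delta)               # C + dC + d^2*C + ...

def defect_once(D, P, delta):
    return D + P * delta / (1 - delta)   # D today, then P forever

C, D, P = 3, 5, 1
print(critical_delta(C, D, P))                            # 0.5
print(cooperate_forever(C, 0.9), defect_once(D, P, 0.9))  # 30.0 vs 14.0 -> cooperate
print(cooperate_forever(C, 0.3), defect_once(D, P, 0.3))  # ~4.29 vs ~5.43 -> defect
```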


The Discount Factor: How Much Do You Care About Tomorrow?

The discount factor δ represents how much you value future payoffs relative to immediate payoffs.

  • δ = 0: Only care about today (no cooperation)
  • δ = 0.5: Tomorrow’s payoff is worth half of today’s
  • δ = 0.9: Tomorrow’s payoff is worth 90% of today’s
  • δ = 1: Future payoffs count equally (perfect patience)
```mermaid
%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#fff','secondaryTextColor':'#fff','tertiaryTextColor':'#fff','textColor':'#fff','nodeTextColor':'#fff'}}}%%
graph LR
    A[Low δ < 0.3] --> B[Short-term thinking]
    B --> C[Defection dominates]
    D[Medium δ ≈ 0.5-0.7] --> E[Moderate patience]
    E --> F[Conditional cooperation]
    G[High δ > 0.9] --> H[Long-term thinking]
    H --> I[Full cooperation sustainable]
    style C fill:#742a2a,stroke:#f56565,stroke-width:2px
    style F fill:#744210,stroke:#ed8936,stroke-width:2px
    style I fill:#2d3748,stroke:#48bb78,stroke-width:2px
```

Real-world implications:

  • Dying industries: Low δ → expect less cooperation
  • Stable partnerships: High δ → cooperation thrives
  • International relations: High δ → treaties sustainable
  • Criminal underworld: Uncertain future → lower δ, but reputation effects can compensate

The Finite Horizon Problem

What if the game has a known end point? Say, exactly 10 rounds?

This creates a problem, via a line of reasoning called backward induction (which we’ll explore in detail in the next post).

Logic:

  1. In round 10 (the last round), there’s no future, so defect
  2. Knowing round 10 ends in defection, there’s no reason to cooperate in round 9
  3. By backward induction, both players defect in every round

The paradox: The Nash equilibrium of a finitely repeated Prisoner’s Dilemma is mutual defection in every round — the same as the one-shot game!
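
Here's a small sketch of the first step of that unraveling, using the hypothetical payoffs 3 (mutual cooperation), 5 (temptation), and 1 (mutual defection), undiscounted for simplicity: against a Grim Trigger opponent in a game with a known final round, the most profitable moment to start defecting is always the last round.

```python
def payoff_if_switch_at(k, T, C=3, D=5, P=1):
    """Total payoff against Grim Trigger in a T-round game when you cooperate in
    rounds 1..k-1, defect in round k (while they still cooperate), and then face
    mutual defection for the remaining T-k rounds."""
    return C * (k - 1) + D + P * (T - k)

T = 10
always_cooperate = 3 * T  # 30: mutual cooperation in every round
best_k = max(range(1, T + 1), key=lambda k: payoff_if_switch_at(k, T))
print(best_k, payoff_if_switch_at(best_k, T), always_cooperate)  # 10 32 30
# Defecting in the known final round beats cooperating throughout (32 > 30).
# Once both players anticipate that, round T-1 becomes "the last round," and
# the same logic repeats all the way back to round 1.
```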

In practice: People still cooperate in finite games, partly because:

  • They’re not perfectly rational
  • They might miscount or forget the exact end
  • Reputation matters beyond this specific game
  • They have social preferences (fairness, reciprocity)

Other Successful Strategies

While Tit-for-Tat is excellent, other strategies also perform well:

1. Generous Tit-for-Tat

Like Tit-for-Tat, but occasionally forgives defection (cooperates with some probability even after opponent defects).

Advantage: More robust to noise (accidental defections)

2. Tit-for-Two-Tats

Only retaliate after two consecutive defections.

Advantage: More forgiving, less prone to mutual punishment spirals

3. Pavlov (Win-Stay, Lose-Shift)

  • If you did well last round, repeat your action
  • If you did poorly, switch actions

Advantage: Can exploit unconditional cooperators while maintaining cooperation with reciprocators

4. Grim Trigger

Cooperate until opponent defects once, then defect forever.

Advantage: Creates strong deterrent against defection

Disadvantage: Unforgiving; one mistake ruins the relationship forever

```mermaid
%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#fff','secondaryTextColor':'#fff','tertiaryTextColor':'#fff','textColor':'#fff','nodeTextColor':'#fff'}}}%%
graph TD
    A[Cooperative Strategies] --> B[Tit-for-Tat]
    A --> C[Generous TFT]
    A --> D[Tit-for-Two-Tats]
    A --> E[Pavlov]
    A --> F[Grim Trigger]
    B --> G[Balance: Nice, Retaliatory, Forgiving]
    C --> H[More Forgiving]
    D --> I[Very Forgiving]
    E --> J[Exploitative but Stable]
    F --> K[Maximum Deterrence]
    style B fill:#2d3748,stroke:#4299e1,stroke-width:3px
    style A fill:#2d3748,stroke:#48bb78,stroke-width:2px
```
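
As a sketch of how these strategies might be written down, here are all four as simple Python functions sharing the history-based interface from the earlier simulation (the function names and the 10% forgiveness rate are illustrative assumptions):

```python
import random

def generous_tft(my_hist, opp_hist, forgive=0.1):
    # Tit-for-Tat, but forgive a defection with some probability (assumed 10%).
    if not opp_hist or opp_hist[-1] == "C":
        return "C"
    return "C" if random.random() < forgive else "D"

def tit_for_two_tats(my_hist, opp_hist):
    # Retaliate only after two consecutive defections.
    return "D" if opp_hist[-2:] == ["D", "D"] else "C"

def pavlov(my_hist, opp_hist):
    # Win-Stay, Lose-Shift: repeat your last action if the opponent cooperated
    # (a good round for you), switch if they defected (a bad round).
    if not my_hist:
        return "C"
    if opp_hist[-1] == "C":
        return my_hist[-1]
    return "C" if my_hist[-1] == "D" else "D"

def grim_trigger(my_hist, opp_hist):
    # Cooperate until the opponent's first defection, then defect forever.
    return "D" if "D" in opp_hist else "C"
```

Any of these can be dropped into the `play()` function from the earlier sketch to run head-to-head matches.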

Real-World Applications

1. International Trade

Countries cooperate on trade deals not because they’re altruistic, but because:

  • They trade repeatedly
  • Defection (tariffs, breaking agreements) triggers retaliation
  • The shadow of the future creates incentives for cooperation

2. Business Relationships

Why do suppliers deliver quality products?

  • Repeat business is more valuable than one-time defection
  • Reputation matters (multilateral repeated games)
  • Long-term contracts implicitly enforce cooperation

3. Social Norms

Why do people follow unwritten rules?

  • We interact repeatedly in communities
  • Defection (rudeness, cheating) gets punished by others
  • Reputation systems enforce cooperation

4. Criminal Organizations

Even illegal groups sustain cooperation through:

  • Repeated interactions
  • Harsh punishments for defection (omertà in the Mafia)
  • High value placed on future interactions (δ)

The Evolutionary Perspective

Repeated games also explain how cooperation evolved biologically.

When organisms interact repeatedly:

  • Cooperators who use Tit-for-Tat outcompete unconditional defectors
  • Reciprocal altruism becomes evolutionarily stable
  • Examples: vampire bats sharing blood meals, cleaner fish and their hosts

Evolution doesn’t require conscious strategy — simple rules like “help those who help you” can emerge through natural selection.


Common Mistakes

Mistake 1: Being Too Nice

Always cooperating (even when exploited) doesn’t work. You need to retaliate to incentivize cooperation.

Mistake 2: Being Too Mean

Always defecting prevents the cooperative equilibrium from ever emerging. You need to reward cooperation.

Mistake 3: Being Unforgiving

Grim Trigger strategies lock you into mutual defection after one mistake. Forgiveness allows relationships to recover.

Mistake 4: Ignoring Noise

In real life, mistakes happen (miscommunication, accidents). Strategies need to be robust to noise.
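
A quick way to see this is to add noise to the earlier simulation: flip each intended action with a small probability and compare plain Tit-for-Tat against Generous Tit-for-Tat. This is a rough sketch with an assumed 5% error rate and the practice problem's payoffs; the exact averages depend on the seed, but the pattern is robust: TFT pairs fall into punishment echoes that Generous TFT escapes.

```python
import random

PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}  # hypothetical

def tft(my, opp):
    return "C" if not opp else opp[-1]

def generous_tft(my, opp, forgive=0.1):
    if not opp or opp[-1] == "C":
        return "C"
    return "C" if random.random() < forgive else "D"

def noisy(action, error=0.05):
    # With probability `error`, the intended action flips (a mistake).
    return action if random.random() >= error else ("D" if action == "C" else "C")

def avg_payoff(strat, rounds=10_000, seed=1):
    random.seed(seed)
    h1, h2, score = [], [], 0
    for _ in range(rounds):
        a1, a2 = noisy(strat(h1, h2)), noisy(strat(h2, h1))
        h1.append(a1); h2.append(a2)
        score += PAYOFF[(a1, a2)]
    return score / rounds

# Under noise, two TFTs echo each other's mistakes; Generous TFT breaks the cycle.
print(avg_payoff(tft))           # typically well below 3
print(avg_payoff(generous_tft))  # typically much closer to 3
```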


Key Takeaways

  1. Repeated games fundamentally change incentives — cooperation can emerge from pure self-interest
  2. Tit-for-Tat is remarkably effective: nice, retaliatory, forgiving, clear
  3. The discount factor matters: you need to value the future enough (high δ)
  4. Known finite horizons undermine cooperation (backward induction)
  5. Real-world cooperation often relies on repeated interaction: trade, business, social norms
  6. Balance is key: be cooperative but not exploitable, retaliatory but forgiving

Practice Problem

You’re playing a repeated Prisoner’s Dilemma with payoffs:

  • Both cooperate: 3 each
  • You defect, they cooperate: 5 to you, 0 to them
  • Both defect: 1 each

You’re considering playing Tit-for-Tat. Your opponent might always defect or might also play Tit-for-Tat.

If your discount factor is δ = 0.9, should you play Tit-for-Tat?

Solution

Against Always Defect:

  • Tit-for-Tat: 0 (first round) + 1·0.9/(1-0.9) = 0 + 9 = 9
  • Always Defect: 1/(1-0.9) = 10
  • (Playing Tit-for-Tat costs you exactly 1: the unreciprocated cooperation in round 1)

Against Tit-for-Tat:

  • Tit-for-Tat vs TFT: 3/(1-0.9) = 30
  • Always Defect: 5 + 1·0.9/(1-0.9) = 5 + 9 = 14

Conclusion: Play Tit-for-Tat. Against TFT players, you do far better (30 vs 14). Against Always Defect, you lose only slightly (9 vs 10). Comparing expected payoffs, TFT is the better choice whenever the opponent plays TFT with probability p satisfying 30p + 9(1-p) > 14p + 10(1-p), i.e. p > 1/17 (about 6%): a very low bar.

This shows how reciprocity pays when others are also reciprocators!
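
The arithmetic is easy to verify in a few lines (a sketch using the first-round-undiscounted convention from the derivation above):

```python
delta = 0.9

def total(first_round, per_round_after):
    # First-round payoff, undiscounted, plus a discounted stream of per_round_after.
    return first_round + per_round_after * delta / (1 - delta)

tft_vs_alld  = total(0, 1)  # cooperate once for 0, then mutual defection: 9.0
alld_vs_alld = total(1, 1)  # mutual defection throughout: 10.0
tft_vs_tft   = total(3, 3)  # mutual cooperation throughout: 30.0
alld_vs_tft  = total(5, 1)  # exploit once for 5, then mutual defection: 14.0

print(tft_vs_alld, alld_vs_alld, tft_vs_tft, alld_vs_tft)  # 9.0 10.0 30.0 14.0

# TFT beats Always Defect in expectation whenever the opponent plays TFT with
# probability p satisfying 30p + 9(1-p) > 14p + 10(1-p), i.e. p > 1/17.
```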


What’s Next?

Repeated games show how cooperation emerges when we look forward. But how do we solve games by looking backward? When you know the game ends in a specific way, you can work backwards to figure out the optimal play from the start.

Next, we’ll explore Backward Induction — the technique that lets you solve complex sequential games by reasoning from the end to the beginning.


This post is part of the Game Theory Series, where we explore the mathematics of strategic decision-making.