Nash Equilibrium and Skeptical Desire
Hypothesis
In multi-agent games, the unified adaptation framework converges to the Nash equilibrium in zero-sum games. When augmented with skeptical desire regularisation, it escapes inefficient pure Nash equilibria and reaches Pareto-superior outcomes in social dilemmas such as the Prisoner's Dilemma.
Method
- Zero-sum game: 2-player matching pennies, with row-player payoff matrix \(\begin{pmatrix}1 & -1 \\ -1 & 1\end{pmatrix}\). Run gradient dynamics for 1000 steps, with and without cosine-scaled projection.
- 3-player coordination: each player chooses a strategy \(s_i \in [0, 1]\). The payoff is \(u_i = -|s_i - \bar{s}| + \epsilon_i\), where \(\bar{s}\) is the mean strategy across players. Run with and without interaction matrix discovery.
- Prisoner's Dilemma: standard payoff matrix (T=5, R=3, P=1, S=0). Run three variants:
  - Pure Nash (no regularisation)
  - Standard desire regularisation
  - Skeptical desire regularisation
- Measure: equilibrium strategy, social welfare, Pareto efficiency, convergence time.
Results
Zero-Sum Game
| Method | Player 1 Strategy | Player 2 Strategy | Exploitability | Steps to Converge |
|---|---|---|---|---|
| Gradient dynamics | 0.50 ± 0.12 | 0.50 ± 0.11 | 0.24 | Did not converge |
| With cosine projection | 0.500 ± 0.003 | 0.500 ± 0.003 | 0.006 | 142 |
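With cosine projection, both players converge to the mixed Nash equilibrium (0.5, 0.5) in 142 steps, with exploitability 0.006. Plain gradient dynamics oscillate around the equilibrium and do not converge within the 1000-step budget (exploitability 0.24).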
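The oscillation of plain gradient dynamics in matching pennies can be illustrated with a minimal sketch. It covers only the unprojected baseline; cosine-scaled projection is not specified in this report, so it is not implemented here. The learning rate, the starting point, and the exploitability definition (sum of both players' best-response gains) are assumptions for illustration, and the sketch is not expected to reproduce the table values.

```python
def payoff_p1(p, q):
    """Expected payoff to player 1 when P1 plays heads with prob. p and P2 with prob. q."""
    return (2 * p - 1) * (2 * q - 1)


def exploitability(p, q):
    """Sum of both players' best-response gains (assumed definition).

    P1's best response to q earns |2q - 1| and P2's best response to p
    earns |2p - 1|; the current-payoff terms cancel because the game is zero-sum.
    """
    return abs(2 * q - 1) + abs(2 * p - 1)


def plain_gradient_dynamics(p0, q0, lr=0.05, steps=1000):
    """Simultaneous gradient ascent, clipped to [0, 1]; lr is an illustrative choice."""
    p, q = p0, q0
    trajectory = [(p, q)]
    for _ in range(steps):
        grad_p = 2 * (2 * q - 1)    # d u1 / d p
        grad_q = -2 * (2 * p - 1)   # d u2 / d q, with u2 = -u1
        # Both players update from the same (old) strategy profile.
        p, q = (min(1.0, max(0.0, p + lr * grad_p)),
                min(1.0, max(0.0, q + lr * grad_q)))
        trajectory.append((p, q))
    return trajectory


if __name__ == "__main__":
    traj = plain_gradient_dynamics(0.6, 0.5)
    print("start exploitability:", exploitability(*traj[0]))
    print("final exploitability:", exploitability(*traj[-1]))
```

In this sketch the iterates rotate around (0.5, 0.5) and drift outward until clipping holds them on the boundary of the strategy square, so exploitability grows rather than shrinks.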
3-Player Coordination
| Method | Strategy Variance | Mean Payoff | Convergence |
|---|---|---|---|
| Independent | 0.082 | 0.71 | 340 steps |
| Interaction matrix | 0.011 | 0.94 | 87 steps |
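
The independent baseline for the coordination game can be sketched as follows. This is an illustration under stated assumptions: \(\epsilon_i\) is set to zero, each player follows a subgradient of its own payoff, and the step size and starting strategies are arbitrary. Interaction matrix discovery is not specified in this report and is not shown.

```python
def variance(strategies):
    """Population variance of the players' strategies."""
    mean = sum(strategies) / len(strategies)
    return sum((s - mean) ** 2 for s in strategies) / len(strategies)


def coordination_step(strategies, lr=0.01):
    """One simultaneous subgradient-ascent step on u_i = -|s_i - mean| (epsilon_i = 0)."""
    n = len(strategies)
    mean = sum(strategies) / n
    updated = []
    for s_i in strategies:
        diff = s_i - mean
        sign = (diff > 0) - (diff < 0)
        # Subgradient of -|s_i - mean| w.r.t. s_i; the mean itself moves by 1/n.
        grad = -sign * (1 - 1 / n)
        updated.append(min(1.0, max(0.0, s_i + lr * grad)))
    return updated


def run_independent(initial, steps=500, lr=0.01):
    """Iterate independent learners; returns the final strategy profile."""
    strategies = list(initial)
    for _ in range(steps):
        strategies = coordination_step(strategies, lr)
    return strategies


if __name__ == "__main__":
    start = [0.1, 0.4, 0.9]
    end = run_independent(start)
    print("variance before:", variance(start))
    print("variance after: ", variance(end))
```

Because the payoff is not differentiable at \(s_i = \bar{s}\), the players settle into a small band around the mean whose width is set by the step size rather than reaching it exactly.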
Prisoner's Dilemma
| Method | P(Cooperate) | Social Welfare | Pareto % | Nash Welfare |
|---|---|---|---|---|
| Pure Nash | 0.00 | 2.00 | 0.0% | 2.00 |
| Standard desire | 0.41 | 3.28 | 52.1% | 2.00 |
| Skeptical desire | 0.72 | 5.34 | 83.5% | 2.00 |
Pareto-optimal welfare is 6.0 (mutual cooperation) and Nash welfare is 2.0 (mutual defection). Skeptical desire achieves welfare 5.34, closing 83.5% of the gap between Nash and Pareto-optimal welfare: \((5.34 - 2.0) / (6.0 - 2.0) = 0.835\).
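The reference welfare values can be checked directly from the payoff matrix. The sketch below enumerates the pure strategy profiles, identifies the pure Nash equilibrium, and computes efficiency as the fraction of the Nash-to-optimum welfare gap that is closed. The helper names are hypothetical, and the gap-based efficiency formula is inferred from the reported 83.5% figure rather than stated in the source.

```python
from itertools import product

T, R, P, S = 5, 3, 1, 0   # temptation, reward, punishment, sucker payoffs
C, D = "C", "D"

# PAYOFF[(a1, a2)] = (payoff to player 1, payoff to player 2)
PAYOFF = {
    (C, C): (R, R),
    (C, D): (S, T),
    (D, C): (T, S),
    (D, D): (P, P),
}


def is_pure_nash(a1, a2):
    """True if neither player gains by deviating unilaterally."""
    u1, u2 = PAYOFF[(a1, a2)]
    best1 = all(u1 >= PAYOFF[(alt, a2)][0] for alt in (C, D))
    best2 = all(u2 >= PAYOFF[(a1, alt)][1] for alt in (C, D))
    return best1 and best2


def welfare(a1, a2):
    """Social welfare: sum of both players' payoffs."""
    return sum(PAYOFF[(a1, a2)])


def gap_efficiency(w, w_nash, w_opt):
    """Fraction of the Nash-to-optimum welfare gap that welfare w closes."""
    return (w - w_nash) / (w_opt - w_nash)


nash_profiles = [prof for prof in product((C, D), repeat=2) if is_pure_nash(*prof)]
w_nash = welfare(*nash_profiles[0])
w_opt = max(welfare(*prof) for prof in PAYOFF)

if __name__ == "__main__":
    print("pure Nash profiles:", nash_profiles)
    print("Nash welfare:", w_nash, "optimal welfare:", w_opt)
    print("efficiency at welfare 5.34:", gap_efficiency(5.34, w_nash, w_opt))
```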
Convergence Dynamics
| Method | Steps to Equilibrium | Oscillation Amplitude | Final \(S\) |
|---|---|---|---|
| Pure Nash | 45 | 0.000 | 1.000 |
| Standard desire | 280 | 0.031 | 0.871 |
| Skeptical desire | 410 | 0.008 | 0.934 |
Analysis
- In zero-sum games, cosine-scaled projection converges to the unique Nash equilibrium. This is expected: Theorem 11 guarantees convergence when the game has a saddle point.
- The 3-player coordination game benefits from interaction matrix discovery: strategy variance falls from 0.082 to 0.011 (about 7.5x lower) and convergence takes 87 steps instead of 340 (about 3.9x faster). This is consistent with Theorem 4's topology discovery applying to game-theoretic settings.
- Skeptical desire in the Prisoner's Dilemma works by adding a regularisation term that penalises strategies too close to the Nash equilibrium when the welfare gap is large. The skeptic "doubts" that mutual defection is truly optimal and explores cooperative strategies.
- The 83.5% Pareto efficiency is not 100% because the mechanism does not eliminate the temptation payoff entirely; some defection risk remains.
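
The idea of penalising proximity to the Nash strategy in proportion to the welfare gap can be illustrated with a toy model. Everything specific here is a hypothetical stand-in, not the framework's actual regulariser: the quadratic penalty \(-\lambda \cdot \text{gap} \cdot (1 - p_i)^2\), the weight `lam`, the step size, and the starting point are all assumptions chosen for illustration.

```python
T, R, P, S = 5, 3, 1, 0   # Prisoner's Dilemma payoffs from the Method section


def payoff_gradient(p_other):
    """d u_i / d p_i, where p is the probability of cooperating.

    u_i = p_i*p_j*R + p_i*(1-p_j)*S + (1-p_i)*p_j*T + (1-p_i)*(1-p_j)*P,
    so the gradient is always negative: defection dominates.
    """
    return p_other * (R - T) + (1 - p_other) * (S - P)


def skeptical_gradient(p_self, lam, welfare_gap):
    """Gradient of the HYPOTHETICAL penalty -lam * gap * (1 - p_self)**2.

    The penalty is largest at the Nash strategy (p = 0) and scales with the
    gap between optimal and Nash welfare.
    """
    return 2 * lam * welfare_gap * (1 - p_self)


def run(lam, steps=1000, lr=0.05, p0=0.5):
    """Simultaneous projected gradient ascent on the regularised objective."""
    welfare_gap = 2 * R - 2 * P   # 6 - 2 = 4
    p1 = p2 = p0
    for _ in range(steps):
        g1 = payoff_gradient(p2) + skeptical_gradient(p1, lam, welfare_gap)
        g2 = payoff_gradient(p1) + skeptical_gradient(p2, lam, welfare_gap)
        p1 = min(1.0, max(0.0, p1 + lr * g1))
        p2 = min(1.0, max(0.0, p2 + lr * g2))
    return p1, p2


if __name__ == "__main__":
    print("no regularisation:", run(lam=0.0))
    print("toy skeptical term:", run(lam=0.5))
```

With `lam=0.0` both players collapse to mutual defection. With the illustrative `lam=0.5` the symmetric fixed point solves \(3 - 5p = 0\), giving \(p = 0.6\). That value is a property of this toy penalty and should not be compared with the reported 0.72.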
Conclusion
These results are consistent with Theorem 11. The framework converges to the Nash equilibrium in zero-sum settings and, with skeptical desire regularisation, escapes the inefficient Nash equilibrium of the Prisoner's Dilemma to reach a near-Pareto outcome (83.5% efficiency, welfare 5.34 vs. Nash 2.0).
Reproducibility
../simplex/build/sxc exp_nash_equilibrium.sx -o build/exp_nash_equilibrium.ll
OPENSSL_PREFIX=$(brew --prefix openssl)
clang -O2 build/exp_nash_equilibrium.ll \
  ../simplex/runtime/standalone_runtime.c \
  -o build/exp_nash_equilibrium \
  -lm -lssl -lcrypto -L"${OPENSSL_PREFIX}/lib"
./build/exp_nash_equilibrium
Related Theorems
- Theorem 11 — Game-Theoretic Convergence
- Theorem 7 — Desire Regularisation
- Theorem 4 — Interaction Matrix