Nash Equilibrium and Skeptical Desire

Experiment: exp_nash_equilibrium.sx | Validates: Theorem 11 (Game Theory)

Hypothesis

In multi-agent games, the unified adaptation framework converges to the Nash equilibrium in zero-sum games. When augmented with skeptical desire regularisation, it escapes inefficient pure Nash equilibria and reaches Pareto-superior outcomes in social dilemmas such as the Prisoner's Dilemma.

Method

Zero-sum game: 2-player matching pennies. Payoff matrix for player 1 (player 2 receives the negative): \(\begin{pmatrix}1 & -1 \\ -1 & 1\end{pmatrix}\). Run gradient dynamics with and without cosine-scaled projection for 1000 steps.
3-player coordination: Each player chooses a strategy \(s_i \in [0, 1]\). The payoff is \(u_i = -|s_i - \bar{s}| + \epsilon_i\), where \(\bar{s}\) is the mean strategy across players. Run with and without interaction matrix discovery.
Prisoner's Dilemma: Standard payoff matrix (T=5, R=3, P=1, S=0). Run three variants:
- Pure Nash (no regularisation)
- Standard desire regularisation
- Skeptical desire regularisation
Measure: equilibrium strategy, social welfare, Pareto efficiency, convergence time.

Results

Zero-Sum Game

Method	Player 1 Strategy	Player 2 Strategy	Exploitability	Steps to Converge
Gradient dynamics	0.50 ± 0.12	0.50 ± 0.11	0.24	Did not converge
With cosine projection	0.500 ± 0.003	0.500 ± 0.003	0.006	142

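With cosine projection, both players converge to the mixed Nash equilibrium (0.5, 0.5) in 142 steps. Plain gradient dynamics oscillates around the equilibrium and does not converge within the 1000-step budget.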
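
For reference, the exploitability column can be read as the total amount both players could gain by switching to a best response; it is zero exactly at a Nash equilibrium. The sketch below computes it for matching pennies and runs plain simultaneous gradient ascent to illustrate the oscillation. It is not the experiment's code: the function names, step size `lr`, and starting point are assumptions, the cosine-scaled projection is not reproduced, and the numbers it produces are not expected to match the table.

```python
import numpy as np

# Matching pennies payoff matrix for player 1 (player 2 receives the negative).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])

def exploitability(p, q):
    """Sum of both players' best-response gains; 0 at a Nash equilibrium.

    p, q: probability that player 1 / player 2 plays the first action.
    """
    x = np.array([p, 1.0 - p])
    y = np.array([q, 1.0 - q])
    value = x @ A @ y                   # expected payoff to player 1
    gain1 = np.max(A @ y) - value       # player 1 maximises x^T A y
    gain2 = value - np.min(x @ A)       # player 2 minimises x^T A y
    return float(gain1 + gain2)

def plain_gradient_dynamics(p, q, lr=0.05, steps=1000):
    """Simultaneous gradient ascent, clipped to [0, 1] (lr is an assumed value)."""
    for _ in range(steps):
        grad_p = 2.0 * (2.0 * q - 1.0)    # d/dp of  (2p - 1)(2q - 1)
        grad_q = -2.0 * (2.0 * p - 1.0)   # d/dq of -(2p - 1)(2q - 1)
        p, q = np.clip(p + lr * grad_p, 0.0, 1.0), np.clip(q + lr * grad_q, 0.0, 1.0)
    return float(p), float(q)

p_end, q_end = plain_gradient_dynamics(0.6, 0.4)
print("exploitability at (0.5, 0.5):", exploitability(0.5, 0.5))
print("exploitability after plain gradient dynamics:", exploitability(p_end, q_end))
```

In this toy version the iterates spiral away from (0.5, 0.5) and end up circulating near the boundary of the strategy square, so the final exploitability stays far from zero.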

3-Player Coordination

Method	Strategy Variance	Mean Payoff	Convergence
Independent	0.082	0.71	340 steps
Interaction matrix	0.011	0.94	87 steps

Prisoner's Dilemma

Method	P(Cooperate)	Social Welfare	Pareto %	Nash Welfare
Pure Nash	0.00	2.00	0.0%	2.00
Standard desire	0.41	3.28	52.1%	2.00
Skeptical desire	0.72	5.34	83.5%	2.00

Pareto-optimal welfare = 6.0 (mutual cooperation, 2R). Nash welfare = 2.0 (mutual defection, 2P). Skeptical desire achieves welfare 5.34, closing 83.5% of the gap between Nash and Pareto-optimal welfare: (5.34 - 2.0) / (6.0 - 2.0) = 0.835.
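
The two reference welfare values follow directly from the payoff matrix in the Method section. The sketch below enumerates the pure strategy profiles, checks which ones are Nash equilibria, and applies the gap normalisation that reproduces the 83.5% figure from the reported welfare of 5.34. The names `is_pure_nash` and `gap_closed` are illustrative and not taken from the experiment source.

```python
# Payoffs from the Method section: temptation, reward, punishment, sucker.
T, R, P, S = 5, 3, 1, 0

# payoff[(a1, a2)] = (player 1 payoff, player 2 payoff); "C" cooperate, "D" defect.
payoff = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}

def is_pure_nash(a1, a2):
    """True if neither player gains by unilaterally switching action."""
    u1, u2 = payoff[(a1, a2)]
    best1 = max(payoff[(alt, a2)][0] for alt in "CD")
    best2 = max(payoff[(a1, alt)][1] for alt in "CD")
    return u1 == best1 and u2 == best2

pure_nash = [profile for profile in payoff if is_pure_nash(*profile)]
nash_welfare = sum(payoff[("D", "D")])       # total payoff under mutual defection
pareto_welfare = sum(payoff[("C", "C")])     # total payoff under mutual cooperation

def gap_closed(welfare):
    """Fraction of the Nash-to-Pareto welfare gap recovered."""
    return (welfare - nash_welfare) / (pareto_welfare - nash_welfare)

print(pure_nash, nash_welfare, pareto_welfare, gap_closed(5.34))
```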

Convergence Dynamics

Method	Steps to Equilibrium	Oscillation Amplitude	Final \(S\)
Pure Nash	45	0.000	1.000
Standard desire	280	0.031	0.871
Skeptical desire	410	0.008	0.934

Analysis

In zero-sum games, cosine-scaled projection converges to the unique Nash equilibrium. This is expected: Theorem 11 guarantees convergence when the game has a saddle point.
The 3-player coordination game benefits from interaction matrix discovery: strategy variance drops from 0.082 to 0.011 (about 7.5x) and convergence takes 87 steps instead of 340 (3.9x faster). This is consistent with Theorem 4's topology discovery applying to game-theoretic settings.
Skeptical desire in the Prisoner's Dilemma works by adding a regularisation term that penalises strategies too close to the Nash equilibrium when the welfare gap is large. The skeptic "doubts" that mutual defection is truly optimal and explores cooperative strategies.
Pareto efficiency stops at 83.5% rather than 100% because the mechanism does not remove the temptation payoff (T=5 exceeds R=3); some defection risk remains, as the cooperation probability of 0.72 shows.
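
The effect described above can be illustrated with a toy model. The exact form of the skeptical desire term is not given in this report, so everything in the sketch below is an assumption made for illustration: each player's payoff gradient (which always points toward defection) is augmented with a push away from the Nash strategy, scaled by a hypothetical weight `lam` times the current welfare gap. The update rule, `lam`, `lr`, and the starting point are all invented here and should not be read as the framework's actual regulariser.

```python
T, R, P, S = 5.0, 3.0, 1.0, 0.0
PARETO_WELFARE = 2 * R

def welfare(c1, c2):
    """Expected total payoff when the players cooperate with probability c1, c2."""
    return (c1 * c2 * 2 * R
            + (c1 * (1 - c2) + (1 - c1) * c2) * (T + S)
            + (1 - c1) * (1 - c2) * 2 * P)

def payoff_grad(c_other):
    """d(own expected payoff)/d(own cooperation probability); negative for all c_other."""
    return c_other * (R - T) + (1 - c_other) * (S - P)

def run(lam, c0=0.1, lr=0.05, steps=2000):
    """Simultaneous gradient steps with a hypothetical welfare-gap push (weight lam)."""
    c1 = c2 = c0
    for _ in range(steps):
        gap = PARETO_WELFARE - welfare(c1, c2)   # large near mutual defection
        g1 = payoff_grad(c2) + lam * gap         # assumed regularised gradient
        g2 = payoff_grad(c1) + lam * gap
        c1 = min(1.0, max(0.0, c1 + lr * g1))
        c2 = min(1.0, max(0.0, c2 + lr * g2))
    return c1, c2

print("no regularisation:", run(lam=0.0))
print("with welfare-gap push:", run(lam=1.0))
```

With `lam=0.0` both players are driven to pure defection. With `lam=1.0` the push and the temptation gradient balance at a partial cooperation level, which mirrors the qualitative point that some defection risk remains; the specific level depends entirely on the assumed penalty and is not meant to reproduce the 0.72 in the table.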

Conclusion

The results support Theorem 11. The framework converges to the Nash equilibrium in zero-sum settings and, with skeptical desire regularisation, escapes the inefficient Nash equilibrium of the Prisoner's Dilemma to reach near-Pareto outcomes (welfare 5.34 vs Nash 2.0, closing 83.5% of the gap to the Pareto optimum of 6.0).

Reproducibility

../simplex/build/sxc exp_nash_equilibrium.sx -o build/exp_nash_equilibrium.ll

OPENSSL_PREFIX=$(brew --prefix openssl)
clang -O2 build/exp_nash_equilibrium.ll \
  ../simplex/runtime/standalone_runtime.c \
  -o build/exp_nash_equilibrium \
  -lm -lssl -lcrypto -L${OPENSSL_PREFIX}/lib

./build/exp_nash_equilibrium

Related Theorems

Theorem 11 — Game-Theoretic Convergence
Theorem 7 — Desire Regularisation
Theorem 4 — Interaction Matrix