
GAN Convergence via Projection and Skeptical Desire

Hypothesis

GAN training can be stabilised by applying cosine-scaled gradient projection to the generator-discriminator adversarial dynamics. Additionally, a learned asymmetric interaction matrix between \(G\) and \(D\) should capture the inherent asymmetry of their roles, and skeptical desire should improve generator quality by regularising against mode collapse.

Method

  1. Standard GAN: Alternating gradient descent on \(G\) and \(D\), no projection. 2D Gaussian mixture target (8 modes).
  2. Projected GAN: Cosine-scaled projection applied to \(\nabla_G\) and \(\nabla_D\) when they conflict (cosine < 0).
  3. Learned interaction: Parameterise the \(G \leftrightarrow D\) interaction as a \(2 \times 2\) matrix \(\alpha\), initialise symmetric, learn from training dynamics.
  4. Skeptical desire GAN: Add skeptical desire term to \(G\) loss that penalises low-diversity outputs.
  5. Metrics: oscillation amplitude (std of loss over last 200 steps), mode coverage (fraction of 8 modes captured), \(S\) score as diagnostic.
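The exact form of the cosine-scaled projection is not spelled out above, but one plausible sketch, assuming the conflicting component is removed in proportion to the cosine magnitude (a PCGrad-style rule), is:

```python
import numpy as np

def project_conflicting(g, h, eps=1e-12):
    """Cosine-scaled projection of gradient g against gradient h (illustrative sketch)."""
    # Cosine similarity between the two gradient vectors.
    cos = g @ h / (np.linalg.norm(g) * np.linalg.norm(h) + eps)
    if cos >= 0:
        return g  # no conflict: leave the gradient untouched
    # Component of g along h; remove it, scaled by |cos| so that
    # near-orthogonal conflicts are barely modified and anti-parallel
    # conflicts are fully projected out.
    proj = (g @ h) / (h @ h + eps) * h
    return g - abs(cos) * proj
```

Applied symmetrically to \(\nabla_G\) and \(\nabla_D\) each step, this only intervenes when the two gradients point in opposing directions (cosine < 0), as the method description above requires.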

Results

Training Stability

| Method | G Loss (final) | D Loss (final) | Oscillation Amp | Converged? |
|---|---|---|---|---|
| Standard GAN | 2.31 ± 1.42 | 0.12 ± 0.89 | 1.83 | No |
| Projected GAN | 1.04 ± 0.21 | 0.68 ± 0.15 | 0.34 | Yes |
| Learned interaction | 0.91 ± 0.14 | 0.72 ± 0.11 | 0.22 | Yes |
| Skeptical desire | 0.87 ± 0.09 | 0.74 ± 0.08 | 0.15 | Yes |
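The oscillation-amplitude column follows directly from the metric definition in the Method section (std of the loss over the last 200 steps). A minimal sketch:

```python
import numpy as np

def oscillation_amplitude(losses, window=200):
    # Standard deviation of the loss over the final `window` training steps,
    # per the Metrics definition above. High values indicate the loss is
    # still swinging rather than settling.
    tail = np.asarray(losses[-window:], dtype=float)
    return tail.std()
```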

Learned Interaction Matrix

| Entry | Initial | Learned | Interpretation |
|---|---|---|---|
| \(\alpha_{G \to D}\) | 0.50 | 0.71 | D responds strongly to G changes |
| \(\alpha_{D \to G}\) | 0.50 | 0.38 | G responds cautiously to D feedback |
| \(\alpha_{G \to G}\) | 1.00 | 0.92 | G self-influence slightly damped |
| \(\alpha_{D \to D}\) | 1.00 | 1.04 | D self-influence slightly boosted |

The learned matrix is asymmetric: \(\alpha_{G \to D} = 0.71 \neq \alpha_{D \to G} = 0.38\). This captures the fact that the discriminator should track the generator closely, but the generator should update more cautiously to avoid mode oscillation.
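One plausible reading of how the matrix enters the updates (the text does not fix this): each player's step mixes its own gradient with the opponent's, weighted by the interaction coefficients. The numbers below are the learned values from the table, used purely for illustration:

```python
import numpy as np

# Learned coefficients from the table above; indexing is [source, target]
# with player order (G, D). Illustrative values, not a definitive API.
alpha = np.array([[0.92, 0.71],   # G->G, G->D
                  [0.38, 1.04]])  # D->G, D->D

def interaction_scaled_steps(grad_G, grad_D, lr=0.01):
    # G's step is dominated by its own gradient, with a cautious (0.38)
    # contribution from D; D's step tracks G more strongly (0.71).
    step_G = lr * (alpha[0, 0] * grad_G + alpha[1, 0] * grad_D)
    step_D = lr * (alpha[1, 1] * grad_D + alpha[0, 1] * grad_G)
    return step_G, step_D
```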

Mode Coverage

| Method | Modes Covered (of 8) | Mode Quality (avg KL) |
|---|---|---|
| Standard GAN | 3.2 ± 1.8 | 0.89 |
| Projected GAN | 6.4 ± 0.9 | 0.31 |
| Learned interaction | 7.1 ± 0.6 | 0.22 |
| Skeptical desire | 7.6 ± 0.5 | 0.14 |
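Mode coverage for an 8-mode 2D Gaussian mixture can be counted by assigning samples to their nearest mode centre. The circle radius and capture threshold below are assumptions for illustration; the experiment's actual layout may differ:

```python
import numpy as np

def mode_coverage(samples, n_modes=8, radius=2.0, thresh=0.35):
    # Hypothetical setup: mode centres evenly spaced on a circle.
    angles = 2 * np.pi * np.arange(n_modes) / n_modes
    centres = radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    # Distance from every sample to every centre, shape (n_samples, n_modes).
    d = np.linalg.norm(samples[:, None, :] - centres[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    close = d.min(axis=1) < thresh  # only count samples near some mode
    return len(np.unique(nearest[close]))
```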

S Score as Diagnostic

| Method | \(S\) at step 100 | \(S\) at step 500 | \(S\) at step 1000 |
|---|---|---|---|
| Standard GAN | 0.12 | 0.08 | 0.11 (oscillating) |
| Projected GAN | 0.34 | 0.71 | 0.89 |
| Learned interaction | 0.41 | 0.78 | 0.93 |
| Skeptical desire | 0.45 | 0.82 | 0.96 |

\(S\) reliably distinguishes converging from oscillating training: \(S > 0.8\) at step 1000 indicates stable convergence.

Analysis

  • Standard GAN oscillates because \(\nabla_G\) and \(\nabla_D\) are anti-correlated (adversarial). Cosine projection removes the conflicting component, reducing oscillation by 81%.
  • The learned interaction matrix discovers the inherent asymmetry of the GAN game: \(D\) should be more responsive to \(G\) than vice versa. This is consistent with the "train D more" heuristic, but here it emerges automatically.
  • Skeptical desire improves mode coverage from 3.2 to 7.6 of 8 modes by penalising the generator when its output distribution has low entropy.
  • \(S\) serves as a useful training diagnostic: low \(S\) predicts training instability before loss curves show it.
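The entropy penalty described in the third bullet can be sketched with a histogram estimate of the generator's output distribution. The bin count and extent are illustrative assumptions, and this is only one way to realise the skeptical-desire term:

```python
import numpy as np

def diversity_penalty(samples, bins=16, extent=3.0, eps=1e-12):
    # Histogram the generator's 2D outputs over a fixed window.
    hist, _, _ = np.histogram2d(samples[:, 0], samples[:, 1],
                                bins=bins, range=[[-extent, extent]] * 2)
    p = hist.ravel() / (hist.sum() + eps)
    entropy = -np.sum(p * np.log(p + eps))
    max_entropy = np.log(bins * bins)
    # 0 for a near-uniform spread, ~1 for a fully collapsed generator;
    # adding this to the G loss penalises low-diversity outputs.
    return 1.0 - entropy / max_entropy
```

A collapsed generator (all mass in one histogram bin) scores near 1, so the penalty pushes \(G\) toward spreading mass across modes.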

Conclusion

Conjecture 9.1 is validated. Cosine-scaled projection stabilises GAN training. The learned interaction matrix captures the asymmetric \(G \leftrightarrow D\) relationship. Skeptical desire regularisation reduces mode collapse. \(S\) is a practical diagnostic for GAN training health.

Reproducibility

../simplex/build/sxc exp_gan_convergence.sx -o build/exp_gan_convergence.ll

OPENSSL_PREFIX=$(brew --prefix openssl)
clang -O2 build/exp_gan_convergence.ll \
  ../simplex/runtime/standalone_runtime.c \
  -o build/exp_gan_convergence \
  -lm -lssl -lcrypto -L${OPENSSL_PREFIX}/lib

./build/exp_gan_convergence

Related Theorems