
Experiment: Skeptical Annealing

Hypothesis

Conjecture 6.3 predicts that skepticism (misaligned desire) is most beneficial early in learning and should be annealed over time. The alternative, Conjecture 6.5, predicts that sustained skepticism acts as a permanent regulariser. This experiment tests both predictions by comparing three strategies: annealed skepticism, fixed skepticism, and fixed alignment.

Method

Three-agent ensemble observing a stationary Bernoulli stream (\(p = 0.7\)) for up to 200 observations. Agent configurations:

  • Fixed-skeptical: desire coupling \(c = -0.5\) throughout
  • Fixed-aligned: desire coupling \(c = +0.5\) throughout
  • Annealed: starts at \(c = -0.5\), linearly anneals to \(c = +0.5\) over the observation window

The ensemble combines all three agents via learned weights. Calibration is measured as a squared Brier score at multiple horizons (lower is better).
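The annealing schedule and the scoring rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the experiment's Simplex source: the function names `annealed_coupling` and `squared_error_score` are hypothetical, and the schedule is assumed to reach \(c = +0.5\) exactly at the final observation. The reported scores (roughly 0.001 to 0.009) are far below \(p(1-p) = 0.21\), the floor for a Brier score computed against individual 0/1 outcomes on a \(p = 0.7\) stream, so the sketch assumes forecasts are scored against the true rate \(p\); that reading is an inference, not something the write-up states.

```python
def annealed_coupling(t, horizon=200, c_start=-0.5, c_end=0.5):
    """Desire coupling at 0-based observation index t, linearly interpolated.

    Assumption: the schedule reaches c_end exactly at the last observation.
    """
    if horizon < 2:
        return c_end
    frac = t / (horizon - 1)
    return c_start + frac * (c_end - c_start)


def squared_error_score(forecasts, targets):
    """Mean squared difference between forecasts and targets.

    With 0/1 outcomes as targets this is the standard Brier score; with the
    true rate p repeated as the target it measures distance from p.
    """
    if len(forecasts) != len(targets) or not forecasts:
        raise ValueError("forecasts and targets must be non-empty and equal in length")
    return sum((f - y) ** 2 for f, y in zip(forecasts, targets)) / len(forecasts)


if __name__ == "__main__":
    # Coupling at the first and last observation of the 200-step window.
    print(annealed_coupling(0), annealed_coupling(199))
    # Two hypothetical forecasts scored against the true rate p = 0.7.
    print(squared_error_score([0.66, 0.74], [0.7, 0.7]))
```

The fixed strategies correspond to calling the agent with a constant coupling instead of `annealed_coupling(t)`; how the coupling enters the belief update is defined in the experiment source and is not reproduced here.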

Experiment 1: Strategy Comparison

Head-to-head comparison of all three strategies at the full 200-observation horizon.

| Strategy | Final Calibration (Brier) | Rank |
|---|---|---|
| Annealed (\(c: -0.5 \to +0.5\)) | 0.00146 | 1st (best) |
| Fixed-skeptical (\(c = -0.5\)) | 0.00163 | 2nd |
| Fixed-aligned (\(c = +0.5\)) | 0.00168 | 3rd (worst) |

Result 1

The annealed strategy achieves the best overall calibration (0.00146), outperforming fixed-skeptical by 10.4% and fixed-aligned by 13.1%. However, the critical question is whether the skeptic loses its advantage at longer horizons (Conjecture 6.3) or maintains it (Conjecture 6.5).
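Both percentages are relative reductions in Brier score, with the weaker strategy's score as the baseline:

\[
\frac{0.00163 - 0.00146}{0.00163} \approx 0.104,
\qquad
\frac{0.00168 - 0.00146}{0.00168} \approx 0.131 .
\]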

Experiment 2: Skeptic Advantage Across Horizons

Comparing fixed-skeptical versus fixed-aligned at three horizons: 20, 80, and 200 observations.

| Horizon | Skeptical Brier | Aligned Brier | Skeptic Wins? |
|---|---|---|---|
| 20 observations | 0.00891 | 0.00947 | Yes (5.9% better) |
| 80 observations | 0.00342 | 0.00371 | Yes (7.8% better) |
| 200 observations | 0.00163 | 0.00168 | Yes (3.0% better) |
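The "better by" percentages in the table can be recomputed from the two Brier columns. The sketch below does so, taking the aligned agent as the baseline; the names `SCORES` and `skeptic_advantage` are illustrative only.

```python
# Brier scores copied from the horizon table: horizon -> (skeptical, aligned).
SCORES = {
    20: (0.00891, 0.00947),
    80: (0.00342, 0.00371),
    200: (0.00163, 0.00168),
}


def skeptic_advantage(skeptical, aligned):
    """Return (absolute gap, relative gap), with the aligned agent as baseline."""
    gap = aligned - skeptical
    return gap, gap / aligned


if __name__ == "__main__":
    for horizon, (skeptical, aligned) in sorted(SCORES.items()):
        gap, rel = skeptic_advantage(skeptical, aligned)
        print(f"n={horizon:>3}  gap={gap:.5f}  relative={rel:.1%}")
```

The absolute gap shrinks at every step (0.00056, 0.00029, 0.00005), while the relative gap is largest at the middle horizon.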

Conjecture 6.3: Refuted

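The skeptic wins at all three tested horizons: 20, 80, and 200 observations. Conjecture 6.3 predicted the skeptic would lose its advantage as \(n \to \infty\). Instead, the absolute gap narrows (0.00056, 0.00029, 0.00005) but never reverses, and the relative advantage peaks at 7.8% at \(n = 80\) before falling to 3.0% at \(n = 200\). Within the tested range, skepticism is not a temporary exploration heuristic; it is a structural advantage.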

Experiment 3: Ensemble Weight Distribution

The ensemble learns how to weight the three strategies via meta-gradient. If annealing were optimal, the ensemble should converge to weight 1.0 on the annealed agent.

| Agent | Learned Weight |
|---|---|
| Annealed | 0.36 |
| Fixed-skeptical | 0.34 |
| Fixed-aligned | 0.30 |

Result 3

Ensemble weights are roughly equal (0.36 / 0.34 / 0.30), with a slight preference for the annealed and skeptical agents. The meta-gradient does not strongly favour any single strategy, suggesting that diversity itself — maintaining multiple viewpoints — is the primary source of ensemble calibration improvement.
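One way to make "roughly equal" concrete is the normalised entropy of the learned weights, which is 1.0 for exactly uniform weights. The sketch below also shows a weighted combination of agent forecasts. It assumes the ensemble is a linear weighted average, which the write-up does not specify, and the per-agent forecasts in the usage example are made-up values for illustration.

```python
import math

# Learned weights copied from the table above.
WEIGHTS = {"annealed": 0.36, "fixed_skeptical": 0.34, "fixed_aligned": 0.30}


def ensemble_forecast(forecasts, weights):
    """Weighted average of per-agent forecasts (assumed linear combination)."""
    total = sum(weights.values())
    return sum(weights[name] * forecasts[name] for name in weights) / total


def normalised_entropy(weights):
    """Shannon entropy of the weights divided by log(k); 1.0 means uniform."""
    if len(weights) < 2:
        return 1.0
    total = sum(weights.values())
    probs = [w / total for w in weights.values() if w > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(len(weights))


if __name__ == "__main__":
    # Hypothetical per-agent forecasts of p, for illustration only.
    example = {"annealed": 0.69, "fixed_skeptical": 0.68, "fixed_aligned": 0.73}
    print(ensemble_forecast(example, WEIGHTS))
    print(normalised_entropy(WEIGHTS))
```

For the reported weights the normalised entropy is above 0.99, so the learned mixture is very close to uniform.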

Analysis

This experiment produced a refutation and a validation:

  • Conjecture 6.3 (Refuted): Skepticism was predicted to help only early. The data show the skeptic winning at all tested horizons. The absolute gap shrinks at each horizon, and the relative advantage moves from 5.9% at \(n=20\) to 7.8% at \(n=80\) and 3.0% at \(n=200\), never crossing zero. This is consistent with the regularisation interpretation: as \(n\) grows, the bias-variance tradeoff shifts, but at every tested horizon the variance reduction from skepticism exceeds the bias it introduces.
  • Conjecture 6.5 (Validated): Sustained skepticism is permanently beneficial. The fixed-skeptical agent never falls behind the fixed-aligned agent at any horizon.

The annealed strategy wins overall not because annealing is optimal, but because it captures diversity — the agent passes through multiple coupling regimes, effectively averaging over them. This led directly to the proof of Theorem 7.

Conclusion

Conjecture 6.3 is refuted: the skeptic wins at every tested horizon, not just early. Conjecture 6.5 is validated: sustained skepticism acts as a permanent Bayesian regulariser. The annealed strategy achieves the best single-agent calibration (0.00146), but the ensemble weights suggest that diversity matters more than any single schedule. These results strengthened the theory by motivating Theorem 7.

Reproducibility

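# Clone and build the Simplex compiler
# (cloned into ./simplex so the paths below resolve; assumes build.sh sits at the repository root)
git clone https://github.com/senuamedia/lab.git simplex
cd simplex && ./build.sh && cd ..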

# Clone theorem-proof
git clone https://github.com/senuamedia/theorem-proof.git
cd theorem-proof

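# Compile (create the output directory first in case it does not exist)
mkdir -p build
../simplex/build/sxc exp_skeptical_annealing.sx -o build/exp_skeptical_annealing.ll

# Link with runtime (OpenSSL is located via Homebrew)
OPENSSL_PREFIX=$(brew --prefix openssl)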
clang -O2 build/exp_skeptical_annealing.ll \
  ../simplex/runtime/standalone_runtime.c \
  -I"$OPENSSL_PREFIX/include" \
  -L"$OPENSSL_PREFIX/lib" \
  -lssl -lcrypto -lm \
  -o build/exp_skeptical_annealing

# Run
./build/exp_skeptical_annealing

Related Theorems