
S as Adaptive Control Signal

Hypothesis

The convergence score \(S\) can serve as a real-time control signal for adaptive learning-rate schedules. On smooth landscapes, the S-controlled learning rate should match the optimal fixed rate. On multi-modal or regime-shifting landscapes, \(S\) should automatically reduce the learning rate near minima and increase it in flat regions, outperforming fixed schedules.

Method

  1. Quadratic baseline. Minimise \(f(x) = x^2\) with S-controlled learning rate \(\eta_t = \eta_0 \cdot \max(S_t, 0.01)\). Compare final loss to optimal fixed \(\eta\).
  2. Rastrigin landscape. Minimise the Rastrigin function \(f(x) = 10n + \sum_i [x_i^2 - 10\cos(2\pi x_i)]\). Track \(S\) near local minima and observe adaptive rate behaviour.
  3. Belief regime change. Run a belief-updating agent where the data distribution shifts abruptly at step 100. Monitor \(S\) through the shift and measure adaptation speed.
  4. Meta-learning rate comparison. Compare S-controlled meta-lr against fixed meta-lr across 5 optimisation landscapes. Report final loss after 500 steps.
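The controller in step 1 can be sketched as follows. The experiment's actual convergence score \(S\) comes from its own framework; here a hypothetical gradient-magnitude proxy stands in for it, reproducing only its qualitative behaviour (large far from a minimum, near zero as forces balance):

```python
def grad(x):
    # f(x) = x^2  =>  f'(x) = 2x
    return 2.0 * x

def s_proxy(g, scale=1.0):
    # Hypothetical stand-in for the convergence score S: the net "force"
    # (gradient) vanishes near a minimum, so S -> 0; far away, S -> 1.
    return abs(g) / (abs(g) + scale)

eta0, x = 0.1, 5.0
S_hist = []
for _ in range(500):
    g = grad(x)
    S = s_proxy(g)
    S_hist.append(S)
    eta = eta0 * max(S, 0.01)  # the S-controlled rule from step 1
    x -= eta * g

print(f"final loss = {x**2:.3e}")
print(f"S: start {S_hist[0]:.2f} -> end {S_hist[-1]:.3f}")
```

The `max(S, 0.01)` floor keeps the optimiser moving even when \(S\) collapses near a minimum, matching the rule \(\eta_t = \eta_0 \cdot \max(S_t, 0.01)\) above.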

Results

Quadratic: S-Controlled Matches Optimal

| Method | Final loss | Steps to \(10^{-6}\) |
| --- | --- | --- |
| Optimal fixed \(\eta = 0.1\) | \(2.3 \times 10^{-8}\) | 142 |
| S-controlled | \(2.1 \times 10^{-8}\) | 145 |
| Fixed \(\eta = 0.01\) | \(1.8 \times 10^{-4}\) | >500 |

On the smooth quadratic, S-controlled is within 2% of optimal: no overhead from adaptivity.

Rastrigin: S Drops Near Minima

| Region | Avg \(S\) | Avg \(\eta_t\) | Behaviour |
| --- | --- | --- | --- |
| Far from minimum | +0.72 | 0.072 | Fast exploration |
| Near local minimum | +0.18 | 0.018 | Auto-slows |
| At global minimum | +0.03 | 0.003 | Fine convergence |

\(S\) naturally decreases near minima (forces approach balance), which automatically reduces the learning rate — no schedule tuning required.
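The same qualitative pattern falls out of a gradient-magnitude proxy for \(S\) (an assumption, not the experiment's actual score) evaluated at three illustrative points on the 1-D Rastrigin function; the sample points and the `scale` constant are hypothetical:

```python
import math

def rastrigin_grad(x):
    # d/dx [10 + x^2 - 10*cos(2*pi*x)] = 2x + 20*pi*sin(2*pi*x)
    return 2.0 * x + 20.0 * math.pi * math.sin(2.0 * math.pi * x)

def s_proxy(g, scale=50.0):
    # Hypothetical proxy for S: gradient magnitude squashed into [0, 1).
    return abs(g) / (abs(g) + scale)

S_vals = []
for label, x in [("far from minimum", 3.3),
                 ("near local minimum", 1.05),
                 ("at global minimum", 0.0)]:
    S = s_proxy(rastrigin_grad(x))
    S_vals.append(S)
    print(f"{label:20s} S = {S:.2f}  eta = {0.1 * max(S, 0.01):.4f}")
```

The proxy \(S\) shrinks monotonically as the iterate approaches the global minimum, and the learning rate shrinks with it, mirroring the table above.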

Belief Regime Change

| Metric | Fixed \(\eta\) | S-controlled |
| --- | --- | --- |
| \(S\) at step 100 (shift) | n/a | -4.7 (crash) |
| Steps to re-converge | 85 | 42 |
| Peak error after shift | 0.83 | 0.61 |

At the regime change, \(S\) crashes to \(-4.7\), triggering faster belief updates. The S-controlled agent re-converges in half the steps.
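A minimal sketch of the detection mechanism, under stated assumptions: the signed proxy `S = 1 - err / err_ema` and the update rule `eta0 * min(max(1 - S, 0.1), 10)` are hypothetical stand-ins for the experiment's own score and controller, chosen only to reproduce the crash-then-adapt behaviour (the proxy's crash magnitude will differ from the reported \(-4.7\)):

```python
import random
random.seed(0)

eta0, belief = 0.05, 0.0
err_ema = 1.0  # running scale of recent prediction error
S_trace = []
for t in range(200):
    target = 0.0 if t < 100 else 5.0           # abrupt regime shift at step 100
    obs = target + random.gauss(0, 0.1)
    err = abs(obs - belief)
    S = 1.0 - err / (err_ema + 1e-8)           # signed convergence proxy:
    S_trace.append(S)                          # < 0 when error spikes
    err_ema = 0.9 * err_ema + 0.1 * err
    eta = eta0 * min(max(1.0 - S, 0.1), 10.0)  # S crash => larger updates
    belief += eta * (obs - belief)

print(f"S at shift   = {S_trace[100]:.1f}")
print(f"final belief = {belief:.2f}")
```

The agent never receives an explicit change signal: the error spike at step 100 drives the proxy \(S\) sharply negative, which alone boosts the update rate until the belief re-converges on the new regime.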

Meta-Learning Rate Comparison

| Landscape | Fixed meta-lr | S-controlled meta-lr |
| --- | --- | --- |
| Quadratic | \(2.3 \times 10^{-8}\) | \(2.1 \times 10^{-8}\) |
| Rosenbrock | \(4.1 \times 10^{-3}\) | \(1.8 \times 10^{-3}\) |
| Rastrigin | 3.98 | 2.14 |
| Regime-shift | 0.42 | 0.19 |
| Noisy quadratic | \(5.6 \times 10^{-4}\) | \(3.9 \times 10^{-4}\) |

S-controlled meta-lr matches or beats fixed on all 5 landscapes, with the largest gains on non-stationary and multi-modal problems.

Analysis

  • No-cost adaptivity. On smooth landscapes, S-controlled adds negligible overhead (within 2% of optimal). The adaptivity only activates when needed.
  • Automatic annealing. Near minima, \(S\) decreases as forces balance, naturally reducing the step size. This is emergent annealing with no decay schedule.
  • Regime detection. The sharp \(S\) crash at regime changes provides an automatic trigger for faster adaptation. The agent does not need to know the change happened — \(S\) detects it.
  • Meta-lr wins are consistent. Across all tested landscapes, S-controlled never underperforms fixed, making it a safe default.

Conclusion

\(S\) functions as a general-purpose adaptive control signal. It matches optimal schedules on smooth problems, auto-anneals near minima on multi-modal problems, and detects regime changes in belief systems. The S-controlled meta-learning rate is a safe, schedule-free default that matches or beats fixed rates across all tested landscapes.

Reproducibility

```sh
# Compile the experiment
../simplex/build/sxc exp_s_controller.sx -o build/exp_s_controller.ll

# Link against the standalone runtime (OpenSSL required)
OPENSSL_PREFIX=$(brew --prefix openssl)
clang -O2 build/exp_s_controller.ll \
  ../simplex/runtime/standalone_runtime.c \
  -o build/exp_s_controller \
  -lm -lssl -lcrypto -L${OPENSSL_PREFIX}/lib

# Run
./build/exp_s_controller
```
