Theorem 14

B-Flow Convergence

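Gradient descent on the balance residual converges to equilibrium at machine precision. In the reported comparison, its final residual is about \( 3.75 \times 10^{11} \) (375 billion) times smaller than the one reached by minimising the loss directly.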

Theorem Statement

Define the balance residual for \( K \) objectives with gradient vectors \( g_1, \ldots, g_K \):

\[ B(\theta) = \frac{\left\|\displaystyle\sum_{i} g_i\right\|^2}{\displaystyle\sum_{i} \|g_i\|^2} \]

Then \( B(\theta) = 0 \) if and only if the system is at equilibrium (\( \sum g_i = 0 \)), and gradient descent on \( B \) (B-flow) converges to equilibrium with precision bounded only by machine epsilon.
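As a concrete reading of the definition, the following minimal sketch evaluates \( B \) for a list of gradient vectors. The function name `balance_residual` and the plain-list representation are illustrative assumptions, not part of the experiment files.

```python
from typing import Sequence


def balance_residual(grads: Sequence[Sequence[float]]) -> float:
    """B = ||sum_i g_i||^2 / sum_i ||g_i||^2 for K gradient vectors."""
    dim = len(grads[0])
    # Component-wise sum of all gradient vectors.
    total = [sum(g[d] for g in grads) for d in range(dim)]
    numerator = sum(t * t for t in total)
    denominator = sum(x * x for g in grads for x in g)
    if denominator == 0.0:
        raise ValueError("B is undefined when every gradient vanishes")
    return numerator / denominator


opposed = balance_residual([[1.0, -2.0], [-1.0, 2.0]])    # gradients cancel
aligned = balance_residual([[1.0, 0.0], [1.0, 0.0]])      # gradients reinforce
orthogonal = balance_residual([[1.0, 0.0], [0.0, 1.0]])   # no interaction
```

Two exactly opposed gradients give \( B = 0 \), two orthogonal gradients give \( B = 1 \), and two identical gradients give \( B = 2 \).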

Proof Sketch

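\( B(\theta) \geq 0 \), with \( B = 0 \) iff \( \sum g_i = 0 \), because the numerator is a squared norm and the denominator is positive. Since \( B \) is a ratio of quadratic forms in the gradients, it is smooth wherever the denominator is nonzero (i.e., not all gradients vanish), provided the gradients \( g_i \) are themselves differentiable in \( \theta \). The negative gradient \( -\nabla_\theta B \) is a descent direction for the imbalance, and because \( B \) is bounded below by 0, gradient descent with a sufficiently small step size yields a non-increasing sequence of \( B \) values that converges. The limit is the equilibrium value \( B = 0 \) as long as the iterates do not settle at a stationary point of \( B \) with \( B > 0 \).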

The critical advantage over loss-flow (minimising \( \sum L_i \)) is that B-flow directly targets the equilibrium condition rather than trying to reduce each loss independently. Loss-flow can stall when competing objectives create flat saddles; B-flow measures the residual imbalance and drives it to zero.
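To make B-flow concrete, here is a minimal sketch on a hypothetical one-dimensional toy problem with two quadratic losses, \( L_1 = \tfrac{1}{2}(\theta - a)^2 \) and \( L_2 = \tfrac{1}{2}(\theta - b)^2 \). The toy problem, the function name `b_flow_1d`, the learning rate and the step count are all assumptions chosen for illustration; this is not the setup used in the experiment files.

```python
def b_flow_1d(theta: float, a: float, b: float,
              lr: float = 0.1, steps: int = 200) -> float:
    """Gradient descent on B for two 1-D quadratic losses (requires a != b)."""
    for _ in range(steps):
        g1, g2 = theta - a, theta - b        # gradients of the two toy losses
        num = (g1 + g2) ** 2                 # ||g1 + g2||^2
        den = g1 ** 2 + g2 ** 2              # ||g1||^2 + ||g2||^2, > 0 when a != b
        d_num = 4.0 * (g1 + g2)              # d(num)/d(theta), as dg1/dtheta = dg2/dtheta = 1
        d_den = 2.0 * (g1 + g2)              # d(den)/d(theta)
        grad_b = (d_num * den - num * d_den) / den ** 2   # quotient rule
        theta -= lr * grad_b                 # B-flow step
    return theta


theta_star = b_flow_1d(theta=0.3, a=0.0, b=2.0)
# Equilibrium check: the summed gradient should be close to zero.
residual = abs((theta_star - 0.0) + (theta_star - 2.0))
```

In this toy case the equilibrium is the midpoint \( \theta = (a + b)/2 \), where the two gradients are equal and opposite. The example has no saddle, so it illustrates only the mechanics of descending on \( B \), not the stalling behaviour of loss-flow.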

B-Flow vs Loss-Flow

Metric	B-Flow	Loss-Flow	Ratio
Final precision	\( 8.8 \times 10^{-16} \)	\( 3.3 \times 10^{-4} \)	\( 3.75 \times 10^{11} \)× better
Reaches equilibrium	Yes (machine epsilon)	Stalls at saddle	—
I-ratio at convergence	\( I = -0.500000000000000 \)	\( I \approx -0.498 \)	—

Two-Phase Optimisation

The optimal strategy combines both flows:

Phase 1: Loss-flow — explore the loss landscape, reduce individual losses, find the basin of attraction. Fast initial progress.
Phase 2: B-flow — switch to balance residual minimisation once in the basin. Refine to machine-precision equilibrium.

The transition point is detected by monitoring \( I(\theta) \): when \( |I + 0.5| < \epsilon_{\text{switch}} \), the system is near enough to equilibrium for B-flow to take over.
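The switching rule can be sketched as follows on a hypothetical 1-D toy problem with losses \( \tfrac{1}{2}(\theta - a)^2 \) and \( \tfrac{1}{2}(\theta - b)^2 \). The function name `two_phase`, the value of `eps_switch`, and the two-objective form \( I = g_1 g_2 / (g_1^2 + g_2^2) \) are assumptions made for illustration.

```python
def two_phase(theta: float, a: float, b: float, eps_switch: float = 1e-2,
              lr: float = 0.1, steps: int = 300):
    """Return (final theta, step at which B-flow took over, or None)."""
    switch_step = None
    for step in range(steps):
        g1, g2 = theta - a, theta - b
        den = g1 ** 2 + g2 ** 2
        i_ratio = (g1 * g2) / den            # assumed definition of I for K = 2
        # Switch once, permanently, when I is close enough to -1/2.
        if switch_step is None and abs(i_ratio + 0.5) < eps_switch:
            switch_step = step
        if switch_step is None:
            theta -= lr * (g1 + g2)          # Phase 1: gradient of L1 + L2
        else:
            num = (g1 + g2) ** 2
            grad_b = (4.0 * (g1 + g2) * den - num * 2.0 * (g1 + g2)) / den ** 2
            theta -= lr * grad_b             # Phase 2: gradient of B
    return theta, switch_step


theta_final, switched_at = two_phase(theta=0.3, a=0.0, b=2.0)
```

The switch is made one-way here so that the optimiser does not bounce between phases. Whether a real implementation should allow switching back is a design choice this sketch leaves open.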

Relationship to I-Ratio

The balance residual \( B \) and interaction ratio \( I \) are directly related:

\[ B(\theta) = 1 + 2I(\theta) \]

So \( B = 0 \iff I = -\frac{1}{2} \), and minimising \( B \) is equivalent to driving \( I \) toward its equilibrium value. B-flow is the gradient-based method for achieving the I-ratio theorem's prediction.
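The identity follows from expanding the squared norm in the numerator of \( B \). The I-ratio is not restated on this page, so the derivation below assumes it is defined as the sum of pairwise gradient inner products normalised by the total squared gradient norm:

\[ I(\theta) = \frac{\displaystyle\sum_{i<j} g_i \cdot g_j}{\displaystyle\sum_{i} \|g_i\|^2} \]

\[ \left\|\sum_{i} g_i\right\|^2 = \sum_{i} \|g_i\|^2 + 2\sum_{i<j} g_i \cdot g_j \quad\Longrightarrow\quad B(\theta) = 1 + 2I(\theta) \]

Dividing the expansion by \( \sum_i \|g_i\|^2 \) gives the stated relation under this assumed definition.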

Experiment Files

exp_balance_residual.sx — B-flow vs loss-flow comparison, two-phase optimisation, precision measurement
exp_equilibrium_mapping.sx — B-flow equilibrium location across problem classes
