B-Flow Convergence
Theorem Statement
Define the balance residual for \( K \) objectives \( L_1, \ldots, L_K \) with gradient vectors \( g_i = \nabla_\theta L_i(\theta) \), \( i = 1, \ldots, K \):
\[ B(\theta) = \frac{\left\|\displaystyle\sum_{i=1}^{K} g_i\right\|^2}{\displaystyle\sum_{i=1}^{K} \|g_i\|^2} \]
Then, wherever at least one gradient is nonzero, \( B(\theta) = 0 \) if and only if the system is at equilibrium (\( \sum_i g_i = 0 \)), and gradient descent on \( B \) (B-flow) converges to that equilibrium, with the final precision limited only by machine epsilon.
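As a minimal sketch of the definition above, the residual can be computed directly from a stack of gradient vectors. The function name `balance_residual` and the NumPy layout (one gradient per row) are assumptions made for illustration; they are not taken from the experiment files.

```python
import numpy as np


def balance_residual(grads):
    """Return B = ||sum_i g_i||^2 / sum_i ||g_i||^2 for gradients stacked as rows."""
    grads = np.asarray(grads, dtype=float)
    denom = np.sum(grads ** 2)  # sum of squared norms
    if denom == 0.0:
        raise ValueError("B is undefined when every gradient vanishes")
    total = grads.sum(axis=0)  # vector sum of the gradients
    return float(total @ total / denom)


# Two opposing gradients cancel exactly: equilibrium, B = 0.
balanced = balance_residual([[1.0, 0.0], [-1.0, 0.0]])
# K identical gradients give the upper bound B = K (here K = 2).
aligned = balance_residual([[1.0, 0.0], [1.0, 0.0]])
# Orthogonal gradients have no cross terms, so B = 1.
orthogonal = balance_residual([[1.0, 0.0], [0.0, 1.0]])
```

By the Cauchy-Schwarz inequality, \( 0 \leq B \leq K \), so the three cases above cover the lower bound, the upper bound, and the uncorrelated midpoint.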
Proof Sketch
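\( B(\theta) \geq 0 \) because it is a squared norm divided by a sum of squared norms, and \( B = 0 \) iff \( \sum_i g_i = 0 \). Since \( B \) is a ratio of quadratic forms in the gradients, it is smooth wherever the denominator is nonzero (i.e., not all gradients vanish), provided the gradients themselves vary smoothly with \( \theta \). For a sufficiently small step size, a gradient descent step along \( -\nabla_\theta B \) does not increase \( B \), and since \( B \) is bounded below by 0, the values \( B(\theta_t) \) converge. Reaching \( B = 0 \) itself additionally requires that the trajectory meets no stationary point of \( B \) with \( B > 0 \); the sketch assumes this.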
The critical advantage over loss-flow (minimising \( \sum L_i \)) is that B-flow directly targets the equilibrium condition rather than trying to reduce each loss independently. Loss-flow can stall when competing objectives create flat saddles; B-flow measures the residual imbalance and drives it to zero.
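The following is a minimal sketch of B-flow on a hypothetical toy problem, not the setup used in the experiment files. It assumes quadratic objectives \( L_i(\theta) = \tfrac{1}{2}\|\theta - c_i\|^2 \), so that \( g_i = \theta - c_i \) and \( \nabla_\theta B \) has a closed form; the names `CENTERS`, `residual_and_grad`, and `b_flow`, and the step size, are illustrative choices.

```python
import numpy as np

# Hypothetical toy problem: L_i(theta) = 0.5 * ||theta - c_i||^2, so g_i = theta - c_i.
CENTERS = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])


def residual_and_grad(theta, centers=CENTERS):
    """Return B(theta) and its analytic gradient for the toy quadratic objectives."""
    k = len(centers)
    grads = theta - centers            # row i is g_i
    s = grads.sum(axis=0)              # sum_i g_i
    n = s @ s                          # numerator ||sum_i g_i||^2
    d = np.sum(grads ** 2)             # denominator sum_i ||g_i||^2
    # Quotient rule with grad(n) = 2*k*s and grad(d) = 2*s (each Hessian is the identity).
    grad_b = (2.0 * k * s * d - 2.0 * n * s) / d ** 2
    return n / d, grad_b


def b_flow(theta0, lr=0.1, steps=2000):
    """Plain gradient descent on B with a fixed step size."""
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        _, grad_b = residual_and_grad(theta)
        theta = theta - lr * grad_b
    return theta


theta_star = b_flow([3.0, -1.0])
final_b, _ = residual_and_grad(theta_star)
```

On this toy problem the equilibrium is the centroid of the \( c_i \), and loss-flow reaches it as well, so the sketch illustrates the mechanics of B-flow rather than the saddle stall described above.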
B-Flow vs Loss-Flow
| Metric | B-Flow | Loss-Flow | Ratio |
|---|---|---|---|
| Final precision | \( 8.8 \times 10^{-16} \) | \( 3.3 \times 10^{-4} \) | \( \approx 3.75 \times 10^{11} \) times better |
| Reaches equilibrium | Yes (machine epsilon) | Stalls at saddle | — |
| I-ratio at convergence | \( I = -0.500000000000000 \) | \( I \approx -0.498 \) | — |
Two-Phase Optimisation
The optimal strategy combines both flows:
- Phase 1: Loss-flow — explore the loss landscape, reduce individual losses, find the basin of attraction. Fast initial progress.
- Phase 2: B-flow — switch to balance residual minimisation once in the basin. Refine to machine-precision equilibrium.
The transition point is detected by monitoring \( I(\theta) \): when \( |I + 0.5| < \epsilon_{\text{switch}} \), the system is near enough to equilibrium for B-flow to take over.
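The switching rule can be sketched as a single loop that changes its descent direction once the criterion fires. This is an illustrative outline on hypothetical quadratic objectives \( L_i(\theta) = \tfrac{1}{2}\|\theta - c_i\|^2 \); the function names, the value of `eps_switch`, and the choice to latch into phase 2 (never switching back) are assumptions, and \( I \) is obtained from \( B \) through the relation \( B = 1 + 2I \).

```python
import numpy as np

# Hypothetical toy objectives L_i(theta) = 0.5 * ||theta - c_i||^2, so g_i = theta - c_i.
CENTERS = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])


def gradient_sum_and_b(theta):
    """Return sum_i g_i, B(theta), and the analytic gradient of B."""
    k = len(CENTERS)
    grads = theta - CENTERS
    s = grads.sum(axis=0)
    n, d = s @ s, np.sum(grads ** 2)
    grad_b = (2.0 * k * s * d - 2.0 * n * s) / d ** 2
    return s, n / d, grad_b


def two_phase(theta0, eps_switch=1e-3, lr=0.1, steps=2000):
    """Loss-flow until |I + 0.5| < eps_switch, then B-flow for the remaining steps."""
    theta = np.array(theta0, dtype=float)
    switch_step = None
    for step in range(steps):
        s, b, grad_b = gradient_sum_and_b(theta)
        i_ratio = (b - 1.0) / 2.0          # from B = 1 + 2I
        if switch_step is None and abs(i_ratio + 0.5) < eps_switch:
            switch_step = step             # phase 2 starts here and is never left
        # Phase 1 follows the gradient of sum_i L_i; phase 2 follows the gradient of B.
        direction = s if switch_step is None else grad_b
        theta = theta - lr * direction
    return theta, switch_step


theta_final, switch_step = two_phase([3.0, -1.0])
_, b_final, _ = gradient_sum_and_b(theta_final)
```

Because \( |I + 0.5| = B/2 \), the test is equivalent to \( B < 2\epsilon_{\text{switch}} \), so the same quantity that phase 2 minimises also decides when phase 2 begins.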
Relationship to I-Ratio
The balance residual \( B \) and interaction ratio \( I \) are directly related:
\[ B(\theta) = 1 + 2I(\theta) \]
So \( B = 0 \iff I = -\frac{1}{2} \), and minimising \( B \) is equivalent to driving \( I \) toward its equilibrium value. B-flow is the gradient-based method for achieving the I-ratio theorem's prediction.
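The relation can be checked in one step. The definition of \( I \) is not restated in this section, so the derivation below assumes that \( I \) is the sum of pairwise gradient inner products divided by the sum of squared norms, \( I(\theta) = \sum_{i<j} g_i \cdot g_j \,/\, \sum_i \|g_i\|^2 \). Expanding the squared norm of the gradient sum gives
\[ \left\|\sum_{i} g_i\right\|^2 = \sum_{i} \|g_i\|^2 + 2\sum_{i<j} g_i \cdot g_j \]
and dividing both sides by \( \sum_i \|g_i\|^2 \) yields
\[ B(\theta) = 1 + 2\,\frac{\displaystyle\sum_{i<j} g_i \cdot g_j}{\displaystyle\sum_{i} \|g_i\|^2} = 1 + 2I(\theta) \]
Under this assumed definition, \( I = -\frac{1}{2} \) means the pairwise cross terms cancel exactly half of the total squared gradient norm, which is the same statement as \( \sum_i g_i = 0 \).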
Experiment Files
exp_balance_residual.sx — B-flow vs loss-flow comparison, two-phase optimisation, precision measurement
exp_equilibrium_mapping.sx — B-flow equilibrium location across problem classes