Balance Residual (B-Flow) Convergence
Hypothesis
Gradient descent on the balance residual \( B(\theta) \) converges to equilibrium with dramatically higher precision than gradient descent on the loss. The balance residual is defined as:
\[ B(\theta) = \frac{\left\|\sum_i g_i(\theta)\right\|^2}{\sum_i \|g_i(\theta)\|^2} \]
where \( g_i(\theta) = \nabla L_i(\theta) \) is the gradient of the \(i\)-th loss. At equilibrium \( B = 0 \) (the gradients cancel). B-flow descends on \( B \) directly rather than on the individual losses.
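As a minimal sketch of the definition above, the residual can be computed from a list of per-loss gradient vectors. The function name `balance_residual` and the choice to raise an error when every gradient is zero (where the ratio is undefined) are assumptions made for illustration, not part of the experiment code.

```python
def balance_residual(grads):
    """Return B = ||sum_i g_i||^2 / sum_i ||g_i||^2.

    `grads` is a list of equal-length gradient vectors (lists of floats).
    """
    dim = len(grads[0])
    # Component-wise sum of all gradients.
    total = [sum(g[k] for g in grads) for k in range(dim)]
    numerator = sum(x * x for x in total)
    denominator = sum(x * x for g in grads for x in g)
    if denominator == 0.0:
        # Assumption: treat the 0/0 case as an error rather than defining B there.
        raise ValueError("B is undefined when every gradient is zero")
    return numerator / denominator


print(balance_residual([[1.0, 0.0], [-1.0, 0.0]]))  # opposing gradients
print(balance_residual([[1.0, 0.0], [0.0, 1.0]]))   # orthogonal gradients
print(balance_residual([[1.0, 2.0], [1.0, 2.0]]))   # identical gradients
```

By the Cauchy-Schwarz inequality, \( B \) lies between 0 (gradients cancel) and the number of losses (all gradients identical), so the three calls above cover both extremes and the orthogonal middle case.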
Method
Setup: Three test problems of increasing complexity: 1D convex, 2D convex, and non-convex.
Parameters:
- 1D: 2 quadratic losses with different minima
- 2D: 2 quadratic losses in 2-dimensional parameter space
- Non-convex: losses with multiple local minima
- Steps: 1000 per method per problem
Procedure: For each problem, run B-flow (gradient descent on \( B \)) and loss-flow (gradient descent on \( \sum L_i \)) side by side. Compare the final balance residual achieved by each method.
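The side-by-side procedure can be sketched on a toy 1D problem. Everything specific here is an assumption chosen for illustration: the two quadratics \( L_1 = \tfrac12(\theta - 0)^2 \) and \( L_2 = \tfrac12(\theta - 2)^2 \), the start point, and the learning rate of 0.1. The gradient of \( B \) is obtained from the quotient rule, \( B' = (2 s s' D - s^2 D') / D^2 \) with \( s = \sum_i g_i \) and \( D = \sum_i g_i^2 \). This sketch does not reproduce the reported numbers, and on a toy like this the relative precision of the two flows depends on the step size.

```python
# Hypothetical 1D problem: L1 = 0.5*(theta - A)^2, L2 = 0.5*(theta - C)^2.
A, C = 0.0, 2.0  # assumed minima; the equilibrium is at theta = 1


def grads(theta):
    """Per-loss gradients g1, g2."""
    return theta - A, theta - C


def balance(theta):
    """Balance residual B for the two losses."""
    g1, g2 = grads(theta)
    return (g1 + g2) ** 2 / (g1 * g1 + g2 * g2)


def balance_grad(theta):
    """dB/dtheta via the quotient rule; both second derivatives equal 1 here."""
    g1, g2 = grads(theta)
    s = g1 + g2            # summed gradient
    d = g1 * g1 + g2 * g2  # sum of squared gradients
    ds = 2.0               # ds/dtheta = 1 + 1
    dd = 2.0 * (g1 + g2)   # dd/dtheta = 2*(g1*1 + g2*1)
    return (2.0 * s * ds * d - s * s * dd) / (d * d)


def loss_flow(theta, lr=0.1, steps=1000):
    """Gradient descent on L1 + L2."""
    for _ in range(steps):
        g1, g2 = grads(theta)
        theta -= lr * (g1 + g2)
    return theta


def b_flow(theta, lr=0.1, steps=1000):
    """Gradient descent on B."""
    for _ in range(steps):
        theta -= lr * balance_grad(theta)
    return theta


theta_loss = loss_flow(3.0)
theta_b = b_flow(3.0)
print("loss-flow: theta =", theta_loss, " B =", balance(theta_loss))
print("B-flow:    theta =", theta_b, " B =", balance(theta_b))
```

Both runs use the same 1000-step budget as the experiment, and the comparison metric is the final value of `balance`, mirroring the "Final B" columns in the results.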
Results
1D Problem
| Method | Final B |
|---|---|
| B-flow | \(0\) (exact) |
| Loss-flow | \(> 0\) |
2D Problem
| Method | Final B |
|---|---|
| B-flow | \(4.7 \times 10^{-34}\) |
| Loss-flow | \(2.3 \times 10^{-4}\) |
Precision ratio: on the 2D problem, B-flow's final residual is smaller than loss-flow's by a factor of \( \approx 4.9 \times 10^{29} \).
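For reference, the ratio follows directly from the two table entries:

\[ \frac{2.3 \times 10^{-4}}{4.7 \times 10^{-34}} = \frac{2.3}{4.7} \times 10^{30} \approx 0.489 \times 10^{30} \approx 4.9 \times 10^{29}, \qquad \log_{10}\!\left(4.9 \times 10^{29}\right) \approx 29.7 \]

so the gap is just under 30 orders of magnitude.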
Non-convex Problem
| Method | Outcome |
|---|---|
| B-flow alone | May converge to a spurious \( B = 0 \) point (e.g. a saddle point) |
| Loss-flow alone | Finds lower-loss region but imprecise equilibrium |
| Two-phase (loss then B) | Best of both: reaches correct basin, then refines to B ≈ 0 |
Analysis
- 1D exact (\( B = 0 \)): In the 1D convex case, B-flow finds the exact Pareto-optimal point where the gradients cancel. Loss-flow ends at a compromise point where the gradients are not fully balanced (final \( B > 0 \)).
- 2D near-exact (\(4.7 \times 10^{-34}\)): The 2D result is roughly 30 orders of magnitude more precise than loss-flow's \(2.3 \times 10^{-4}\) (a factor of \( \approx 4.9 \times 10^{29} \)). This is because B-flow's gradient \( \nabla B \) points directly toward gradient cancellation, while loss-flow's gradient \( \nabla \sum L_i \) only incidentally approaches equilibrium.
- Non-convex two-phase: B-flow's weakness is that \( B = 0 \) can occur at saddle points or undesirable equilibria in non-convex landscapes. The two-phase strategy uses loss-flow to navigate to the correct basin, then switches to B-flow for high-precision refinement. This combines the global search of loss-flow with the precision of B-flow.
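The two-phase strategy can be illustrated with a small 1D sketch. The loss pair below is hypothetical and is not the non-convex problem used in the experiment: \( L_1 = \tfrac14\theta^4 + 2\theta \) and \( L_2 = -\tfrac12\theta^2 - 2\theta \), whose sum is a double well with minima at \( \theta = \pm 1 \) and a local maximum at \( \theta = 0 \). In 1D there are no saddle points, so the local maximum plays the role of the undesirable \( B = 0 \) equilibrium. The start point, learning rates, and the 50/950 split of the step budget are also assumptions.

```python
# Hypothetical non-convex pair (not the losses used in the experiment):
#   L1 = 0.25*theta^4 + 2*theta,   L2 = -0.5*theta^2 - 2*theta


def grads(theta):
    """First derivatives of L1 and L2."""
    return theta ** 3 + 2.0, -theta - 2.0


def hessians(theta):
    """Second derivatives of L1 and L2."""
    return 3.0 * theta ** 2, -1.0


def total_loss(theta):
    """L1 + L2 (the linear terms cancel)."""
    return 0.25 * theta ** 4 - 0.5 * theta ** 2


def balance(theta):
    g1, g2 = grads(theta)
    return (g1 + g2) ** 2 / (g1 * g1 + g2 * g2)


def balance_grad(theta):
    """dB/dtheta via the quotient rule."""
    g1, g2 = grads(theta)
    h1, h2 = hessians(theta)
    s, d = g1 + g2, g1 * g1 + g2 * g2
    ds, dd = h1 + h2, 2.0 * (g1 * h1 + g2 * h2)
    return (2.0 * s * ds * d - s * s * dd) / (d * d)


def loss_flow(theta, lr, steps):
    for _ in range(steps):
        g1, g2 = grads(theta)
        theta -= lr * (g1 + g2)
    return theta


def b_flow(theta, lr, steps):
    for _ in range(steps):
        theta -= lr * balance_grad(theta)
    return theta


theta0 = 0.2  # starts close to the local maximum of the summed loss

# B-flow alone is drawn to the nearest B = 0 point, here the local maximum.
theta_b_only = b_flow(theta0, lr=1.0, steps=1000)

# Two-phase: loss-flow first to reach a basin, then B-flow to refine.
theta_basin = loss_flow(theta0, lr=0.1, steps=50)
theta_two_phase = b_flow(theta_basin, lr=1.0, steps=950)

print("B-flow alone:", theta_b_only, total_loss(theta_b_only))
print("two-phase:   ", theta_two_phase, total_loss(theta_two_phase))
```

In this toy, B-flow alone ends near \( \theta = 0 \), where \( B \) vanishes but the summed loss is at a local maximum, while the two-phase run ends near \( \theta = 1 \), a minimum of the summed loss with \( B \approx 0 \).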
Conclusion
Pass: B-flow achieves exact equilibrium in 1D and a final residual of \( 4.7 \times 10^{-34} \) in 2D, vastly outperforming loss-flow on both problems. The two-phase strategy handles non-convex landscapes. The results are consistent with Theorem 14.
Reproducibility
../simplex/build/sxc exp_balance_residual.sx -o build/exp_balance_residual.ll
clang -O2 build/exp_balance_residual.ll ../simplex/runtime/standalone_runtime.c \
-o build/exp_balance_residual -lm -lssl -lcrypto -L$(brew --prefix openssl)/lib
./build/exp_balance_residual
Related
- Theorem 14 — B-Flow Convergence
- exp-iratio-proof — I-Ratio Theorem (Theorem 13)
- exp-composition — Full composed system