Interaction Matrix
Theorem Statement
For \( K \) objectives with gradients \( g_1, \ldots, g_K \), define the interaction matrix \( A \in \mathbb{R}^{K \times K} \), where entry \( A_{ij} \) sets how strongly the component of \( g_i \) along \( g_j \) is projected out of objective \( i \)'s gradient:
\[ g_i' = g_i - \sum_{j \neq i} A_{ij} \cdot \frac{g_i \cdot g_j}{\|g_j\|^2} \, g_j \]
The matrix \( A \) is learnable via meta-gradient descent on the system's convergence rate. The meta-gradient update is:
\[ A_{ij} \leftarrow A_{ij} - \eta_{\text{meta}} \cdot \frac{\partial \mathcal{L}_{\text{meta}}}{\partial A_{ij}} \]
where \( \mathcal{L}_{\text{meta}} \) measures the convergence rate of the inner optimisation.
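As an illustration, the projection rule for \( g_i' \) can be sketched in NumPy. This is a minimal sketch, not the implementation in the experiment files: the function name `project_gradients`, the row-per-objective layout, and the `eps` guard against zero-norm gradients are assumptions made here.

```python
import numpy as np

def project_gradients(grads, A, eps=1e-12):
    """Apply g_i' = g_i - sum_{j != i} A_ij * (g_i . g_j / ||g_j||^2) * g_j.

    grads: array of shape (K, d), one gradient per row (assumed layout).
    A:     array of shape (K, K); the diagonal is ignored.
    """
    grads = np.asarray(grads, dtype=float)
    A = np.asarray(A, dtype=float)
    dots = grads @ grads.T                       # dots[i, j] = g_i . g_j
    sq_norms = np.maximum(np.diag(dots), eps)    # ||g_j||^2, guarded against zero
    coeff = A * dots / sq_norms[None, :]         # coeff[i, j] is the weight on g_j in row i
    np.fill_diagonal(coeff, 0.0)                 # the sum excludes j == i
    return grads - coeff @ grads

# Two objectives; only objective 0 projects away objective 1 (A_01 = 1, A_10 = 0).
g = np.array([[1.0, 0.0], [1.0, 1.0]])
A = np.array([[0.0, 1.0], [0.0, 0.0]])
g_proj = project_gradients(g, A)
```

With \( A_{01} = 1 \) the component of \( g_0 \) along \( g_1 \) is removed entirely, so the projected \( g_0' \) is orthogonal to \( g_1 \), while \( g_1 \) is left unchanged because \( A_{10} = 0 \).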
Proof Sketch
The interaction matrix generalises the scalar cosine-scaled projection (Theorem 2) to a per-pair basis. When \( A_{ij} = \alpha \cdot |\cos(g_i, g_j)| \) for all pairs, we recover Theorem 2. The meta-gradient allows the system to discover that some pairs need stronger projection (highly conflicting objectives) while others need less (cooperative objectives).
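The special case \( A_{ij} = \alpha \cdot |\cos(g_i, g_j)| \) can be written out directly. The sketch below only builds that matrix; the name `cosine_scaled_matrix` and the `eps` guard are assumptions, and the details of Theorem 2 itself are not reproduced here.

```python
import numpy as np

def cosine_scaled_matrix(grads, alpha=1.0, eps=1e-12):
    """Build A with A_ij = alpha * |cos(g_i, g_j)| for i != j."""
    grads = np.asarray(grads, dtype=float)
    norms = np.maximum(np.linalg.norm(grads, axis=1), eps)  # guard zero gradients
    cos = (grads @ grads.T) / np.outer(norms, norms)        # pairwise cosines
    A = alpha * np.abs(cos)
    np.fill_diagonal(A, 0.0)                                # no self-interaction
    return A

g = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
A_cos = cosine_scaled_matrix(g, alpha=0.5)
```

Because \( |\cos(g_i, g_j)| = |\cos(g_j, g_i)| \), this matrix is always symmetric, so any asymmetry can only arise once the entries are learned per pair.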
Convergence of the meta-learning follows from two conditions: the entries of the interaction matrix are bounded, and the meta-loss is smooth. In practice, the matrix stabilises within 5 meta-gradient cycles.
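One meta-gradient update on \( A \) might look like the following sketch. The text does not give \( \mathcal{L}_{\text{meta}} \) in closed form, so `meta_loss` is a hypothetical user-supplied callable, and central finite differences stand in for an analytic meta-gradient; `meta_step`, `eta_meta`, and `h` are names chosen for illustration.

```python
import numpy as np

def meta_step(A, meta_loss, eta_meta=0.1, h=1e-5):
    """One update A_ij <- A_ij - eta_meta * dL_meta/dA_ij for all i != j.

    meta_loss: hypothetical callable mapping a (K, K) matrix to a scalar.
    The gradient is estimated by central finite differences (an assumption;
    the source does not say how the meta-gradient is computed).
    """
    A = np.asarray(A, dtype=float)
    grad = np.zeros_like(A)
    K = A.shape[0]
    for i in range(K):
        for j in range(K):
            if i == j:
                continue                      # diagonal entries are unused
            step = np.zeros_like(A)
            step[i, j] = h
            grad[i, j] = (meta_loss(A + step) - meta_loss(A - step)) / (2.0 * h)
    return A - eta_meta * grad

# Toy stand-in for the meta-loss: squared distance to a fixed target matrix.
target = np.array([[0.0, 0.785], [0.2, 0.0]])
toy_loss = lambda M: float(np.sum((M - target) ** 2))
A_next = meta_step(np.zeros((2, 2)), toy_loss, eta_meta=0.5)
```

The toy loss is smooth and quadratic, so a single step with `eta_meta=0.5` lands on the target; a real meta-loss based on inner convergence rate would be noisier and need several cycles.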
Key Properties
- Converges in 5 cycles — the interaction matrix stabilises rapidly, much faster than the inner optimisation
- Asymmetric structure — the learned \( A \) is generally not symmetric: objective \( i \) may project away objective \( j \) strongly while \( j \) barely projects away \( i \)
- Group structure discovery — objectives that cooperate form groups with near-zero interaction weights (within-group \( \approx 0 \)), while competing groups have strong interaction (between-group \( \approx 0.785 \))
Group Structure Discovery
| Interaction Type | Learned \( A_{ij} \) | Interpretation |
|---|---|---|
| Within cooperative group | \( \approx 0.0 \) | No projection needed (aligned gradients) |
| Between competing groups | \( \approx 0.785 \) | Strong projection (numerically \( \approx \pi/4 \)) |
| Weakly conflicting pair | \( \approx 0.2 \text{--} 0.4 \) | Partial projection |
The between-group value of \( 0.785 \approx \pi/4 \) is not prescribed; it emerges from the meta-gradient as the optimal projection strength for typical inter-group conflict angles.
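Given a learned matrix, the group structure in the table can be read off by thresholding. The sketch below treats a pair as cooperative when both \( A_{ij} \) and \( A_{ji} \) fall below a cutoff and then collects connected components; the cutoff of 0.1 and the function name `cooperative_groups` are assumptions, not values from the experiments.

```python
import numpy as np

def cooperative_groups(A, threshold=0.1):
    """Group objectives whose mutual interaction weights are near zero.

    A pair (i, j) counts as cooperative when max(A_ij, A_ji) < threshold
    (assumed rule); groups are connected components of that relation.
    """
    A = np.asarray(A, dtype=float)
    K = A.shape[0]
    cooperative = np.maximum(A, A.T) < threshold   # use the stronger direction
    labels = [-1] * K
    groups = []
    for start in range(K):
        if labels[start] != -1:
            continue
        labels[start] = len(groups)
        stack, members = [start], []
        while stack:                               # depth-first traversal
            i = stack.pop()
            members.append(i)
            for j in range(K):
                if j != i and labels[j] == -1 and cooperative[i, j]:
                    labels[j] = len(groups)
                    stack.append(j)
        groups.append(sorted(members))
    return groups

# Illustrative matrix: objectives {0, 1} and {2, 3} cooperate internally.
A = np.array([
    [0.0,   0.01,  0.785, 0.785],
    [0.02,  0.0,   0.785, 0.70],
    [0.785, 0.785, 0.0,   0.0],
    [0.6,   0.785, 0.01,  0.0],
])
groups = cooperative_groups(A)
```

Taking the maximum of \( A_{ij} \) and \( A_{ji} \) keeps the grouping well defined even though the learned matrix is generally asymmetric.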
Convergence Timeline
| Meta Cycle | \( \|A^{(t)} - A^{(t-1)}\|_F \) | Inner Convergence Rate |
|---|---|---|
| 1 | Large (initialisation) | Baseline |
| 2 | 0.42 | 1.8x faster |
| 3 | 0.11 | 2.5x faster |
| 4 | 0.03 | 2.7x faster |
| 5 | 0.004 | 2.8x faster (converged) |
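The change column in the timeline is a Frobenius norm of successive differences, which suggests a simple stopping test. In this sketch the tolerance of 0.01 is an assumption chosen so that the reported values cross it at cycle 5; the helper names are illustrative.

```python
import numpy as np

def frobenius_change(A_prev, A_curr):
    """Return ||A_curr - A_prev||_F."""
    diff = np.asarray(A_curr, dtype=float) - np.asarray(A_prev, dtype=float)
    return float(np.linalg.norm(diff, ord="fro"))

def first_converged_cycle(changes, tol=0.01):
    """Return the first cycle whose change is below tol, or None.

    changes: dict mapping meta cycle number to the Frobenius change.
    tol:     assumed convergence tolerance (not specified in the text).
    """
    for cycle in sorted(changes):
        if changes[cycle] < tol:
            return cycle
    return None

# Changes reported in the timeline table for cycles 2 to 5.
reported = {2: 0.42, 3: 0.11, 4: 0.03, 5: 0.004}
cycle = first_converged_cycle(reported)
```

A looser tolerance such as 0.05 would already declare convergence at cycle 4, so the "5 cycles" figure depends on how strictly stabilisation is defined.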
Structural Stability (Conjecture 6.9 — VALIDATED)
When the interaction matrix is perturbed by up to 20%, it recovers to within 1% of its original values within roughly 10 cycles. This structural stability means the discovered group topology is robust to noise and partial resets.
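A perturbation test of this kind could be set up as sketched below. Two interpretations are assumed here and may differ from the experiment files: the perturbation is multiplicative noise of at most 20% per entry, and "within 1%" is measured as a relative Frobenius-norm deviation. The re-learning loop between perturbation and the recovery check is omitted.

```python
import numpy as np

def perturb(A, rng, max_fraction=0.2):
    """Scale each entry by a random factor in [1 - max_fraction, 1 + max_fraction]."""
    A = np.asarray(A, dtype=float)
    noise = rng.uniform(-max_fraction, max_fraction, size=A.shape)
    return A * (1.0 + noise)

def relative_deviation(A_ref, A):
    """Return ||A - A_ref||_F / ||A_ref||_F (assumes A_ref is not all zeros)."""
    A_ref = np.asarray(A_ref, dtype=float)
    A = np.asarray(A, dtype=float)
    return float(np.linalg.norm(A - A_ref) / np.linalg.norm(A_ref))

def has_recovered(A_ref, A, tol=0.01):
    """True when A is within tol (default 1%) of A_ref in relative Frobenius norm."""
    return relative_deviation(A_ref, A) <= tol

rng = np.random.default_rng(0)
A_ref = np.array([[0.0, 0.785], [0.3, 0.0]])
A_perturbed = perturb(A_ref, rng)
# After re-running the meta-gradient cycles on A_perturbed, the result
# would be passed to has_recovered(A_ref, ...).
```

A relative Frobenius measure is used instead of an entry-wise relative error because the within-group entries are near zero, where entry-wise ratios are ill-defined.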
Experiment Files
exp_interaction_matrix.sx — Meta-gradient learning of interaction matrix, convergence in 5 cycles
exp_symmetry_breaking.sx — Group structure discovery, asymmetric interactions, perturbation recovery