Theorem 4

Interaction Matrix

A \( K \times K \) matrix of pairwise projection strengths, one row and column per objective, learnable via meta-gradient. In the reported experiments it stabilises within 5 meta-gradient cycles and discovers asymmetric group structure.

Theorem Statement

For \( K \) objectives with gradients \( g_1, \ldots, g_K \), define the interaction matrix \( A \in \mathbb{R}^{K \times K} \), where entry \( A_{ij} \) controls how strongly the component of \( g_i \) along \( g_j \) is projected away. The diagonal entries \( A_{ii} \) are unused, since the sum excludes \( j = i \):

\[ g_i' = g_i - \sum_{j \neq i} A_{ij} \cdot \frac{g_i \cdot g_j}{\|g_j\|^2} \, g_j \]
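The projection formula above can be sketched directly in plain Python. This is a minimal illustration, not the code in the experiment files: the function names `dot` and `project_gradients` and the `eps` guard against zero-norm gradients are assumptions introduced here. It follows the formula as written, so the projection is applied to every pair regardless of the sign of \( g_i \cdot g_j \).

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_gradients(grads, A, eps=1e-12):
    """Apply g_i' = g_i - sum_{j != i} A[i][j] * (g_i . g_j) / ||g_j||^2 * g_j.

    grads: list of K gradient vectors (lists of floats).
    A:     K x K interaction matrix (list of lists); the diagonal is ignored.
    eps:   hypothetical guard; pairs where ||g_j||^2 < eps are skipped.
    """
    K = len(grads)
    projected = []
    for i in range(K):
        g_new = list(grads[i])
        for j in range(K):
            if j == i:
                continue
            norm_sq = dot(grads[j], grads[j])
            if norm_sq < eps:
                continue  # a zero gradient defines no projection direction
            # The coefficient always uses the ORIGINAL g_i, as in the formula.
            coeff = A[i][j] * dot(grads[i], grads[j]) / norm_sq
            g_new = [x - coeff * y for x, y in zip(g_new, grads[j])]
        projected.append(g_new)
    return projected


# Asymmetric example: objective 0 fully projects away objective 1,
# while objective 1 ignores objective 0.
grads = [[1.0, 0.0], [-1.0, 1.0]]
A = [[0.0, 1.0], [0.0, 0.0]]
projected = project_gradients(grads, A)
```

With \( A_{01} = 1 \), the projected \( g_0' \) is orthogonal to \( g_1 \); with \( A_{10} = 0 \), \( g_1 \) is left unchanged.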

The matrix \( A \) is learnable via meta-gradient descent on the system's convergence rate. The meta-gradient update is:

\[ A_{ij} \leftarrow A_{ij} - \eta_{\text{meta}} \cdot \frac{\partial \mathcal{L}_{\text{meta}}}{\partial A_{ij}} \]

where \( \mathcal{L}_{\text{meta}} \) measures the convergence rate of the inner optimisation.
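As a rough sketch of a single meta-gradient step, the update rule can be written against any callable meta-loss. Two things here are assumptions rather than the source's method: the partial derivatives are approximated by central finite differences instead of differentiating through the inner optimisation, and `toy_meta_loss` is a hypothetical quadratic stand-in, because the exact form of \( \mathcal{L}_{\text{meta}} \) is not given here.

```python
def meta_gradient_step(A, meta_loss, eta_meta=0.1, h=1e-5):
    """One update A_ij <- A_ij - eta_meta * dL_meta/dA_ij for all i != j.

    The gradient is estimated with central finite differences (an assumption
    made for this sketch); diagonal entries are left untouched.
    """
    K = len(A)
    grad = [[0.0] * K for _ in range(K)]
    for i in range(K):
        for j in range(K):
            if i == j:
                continue
            A_plus = [row[:] for row in A]
            A_minus = [row[:] for row in A]
            A_plus[i][j] += h
            A_minus[i][j] -= h
            grad[i][j] = (meta_loss(A_plus) - meta_loss(A_minus)) / (2 * h)
    return [[A[i][j] - eta_meta * grad[i][j] for j in range(K)]
            for i in range(K)]


# Hypothetical stand-in for L_meta: squared distance to a fixed target matrix.
TARGET = [[0.0, 0.785], [0.2, 0.0]]


def toy_meta_loss(A):
    return sum((A[i][j] - TARGET[i][j]) ** 2
               for i in range(2) for j in range(2) if i != j)


A0 = [[0.0, 0.0], [0.0, 0.0]]
A1 = meta_gradient_step(A0, toy_meta_loss, eta_meta=0.1)
```

On this toy loss each off-diagonal entry moves a fraction \( 2\eta_{\text{meta}} \) of the way towards its target per step. A real meta-loss would require running the inner optimisation for every evaluation, which is why automatic differentiation is usually preferred over finite differences.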

Proof Sketch

The interaction matrix generalises the scalar cosine-scaled projection (Theorem 2) to a per-pair basis. When \( A_{ij} = \alpha \cdot |\cos(g_i, g_j)| \) for all pairs, we recover Theorem 2. The meta-gradient allows the system to discover that some pairs need stronger projection (highly conflicting objectives) while others need less (cooperative objectives).
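The special case that recovers Theorem 2 can be illustrated by building \( A \) from the gradients themselves. The helper names `cosine` and `theorem2_matrix` are hypothetical, and returning a cosine of 0 for a zero-norm gradient is a convention chosen for this sketch.

```python
import math


def cosine(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is zero."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def theorem2_matrix(grads, alpha):
    """Build A_ij = alpha * |cos(g_i, g_j)| for i != j, with a zero diagonal."""
    K = len(grads)
    return [[0.0 if i == j else alpha * abs(cosine(grads[i], grads[j]))
             for j in range(K)]
            for i in range(K)]


# Three objectives: g_0 and g_2 are opposed, g_1 is orthogonal to both.
grads = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
A = theorem2_matrix(grads, alpha=0.5)
```

Because \( |\cos(g_i, g_j)| = |\cos(g_j, g_i)| \), this matrix is always symmetric. The learned matrix is not constrained in this way, which is what allows the asymmetric structure described under Key Properties.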

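The convergence argument for the meta-learning relies on two conditions: the entries of \( A \) are bounded and the meta-loss is smooth. Under these conditions, gradient descent with a sufficiently small \( \eta_{\text{meta}} \) approaches a stationary point of \( \mathcal{L}_{\text{meta}} \). In practice, the matrix stabilises within 5 meta-gradient cycles.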

Key Properties

Converges in 5 cycles — the interaction matrix stabilises rapidly, much faster than the inner optimisation
Asymmetric structure — the learned \( A \) is generally not symmetric: objective \( i \) may project away objective \( j \) strongly while \( j \) barely projects away \( i \)
Group structure discovery — objectives that cooperate form groups with near-zero interaction weights (within-group \( \approx 0 \)), while competing groups have strong interaction (between-group \( \approx 0.785 \))

Group Structure Discovery

Interaction Type	Learned \( A_{ij} \)	Interpretation
Within cooperative group	\( \approx 0.0 \)	No projection needed (aligned gradients)
Between competing groups	\( \approx 0.785 \)	Strong projection (numerically close to \( \pi/4 \))
Weakly conflicting pair	\( \approx 0.2 \text{--} 0.4 \)	Partial projection

The between-group value of \( 0.785 \approx \pi/4 \) is not prescribed; it emerges from the meta-gradient as the optimal projection strength for typical inter-group conflict angles.
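One way to read group structure off a learned matrix is to link two objectives whenever both \( A_{ij} \) and \( A_{ji} \) are near zero, then take connected components. This is only an illustrative sketch: the function `cooperative_groups`, the threshold of 0.1, and the example matrix (filled with the approximate values from the table above, not experimental output) are all assumptions.

```python
def cooperative_groups(A, threshold=0.1):
    """Group objectives whose mutual interaction weights are both below threshold.

    Uses a small union-find over objective indices. Requiring BOTH A[i][j] and
    A[j][i] to be small is a choice made for this sketch, since the learned
    matrix is generally asymmetric.
    """
    K = len(A)
    parent = list(range(K))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(K):
        for j in range(i + 1, K):
            if max(abs(A[i][j]), abs(A[j][i])) < threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(K):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Illustrative matrix: objectives {0, 1} and {2, 3} form two cooperative groups.
A = [
    [0.0,   0.0,   0.785, 0.785],
    [0.0,   0.0,   0.785, 0.785],
    [0.785, 0.785, 0.0,   0.0],
    [0.785, 0.785, 0.0,   0.0],
]
groups = cooperative_groups(A)  # [[0, 1], [2, 3]]
```

Weakly conflicting pairs in the 0.2 to 0.4 range stay above the threshold used here, so they are treated as belonging to different groups; a different threshold would change that.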

Convergence Timeline

Meta Cycle	\( \|A^{(t)} - A^{(t-1)}\|_F \)	Inner Convergence Rate
1	Large (initialisation)	Baseline
2	0.42	1.8x faster
3	0.11	2.5x faster
4	0.03	2.7x faster
5	0.004	2.8x faster (converged)
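The middle column of the timeline is the Frobenius norm of the change in \( A \) between consecutive meta cycles. A stopping check built on it might look like the sketch below; the function names and the tolerance of 0.01 are assumptions, as no convergence threshold is stated here.

```python
import math


def frobenius_delta(A_new, A_old):
    """Frobenius norm of A_new - A_old for two equally sized matrices."""
    return math.sqrt(sum((a - b) ** 2
                         for row_new, row_old in zip(A_new, A_old)
                         for a, b in zip(row_new, row_old)))


def has_converged(history, tol=0.01):
    """True once the last two matrices in history differ by less than tol.

    history: list of interaction matrices, one per meta cycle.
    tol:     hypothetical tolerance chosen for this sketch.
    """
    if len(history) < 2:
        return False
    return frobenius_delta(history[-1], history[-2]) < tol


# Two consecutive iterates that differ in a single entry by 0.004.
A_prev = [[0.0, 0.781], [0.2, 0.0]]
A_curr = [[0.0, 0.785], [0.2, 0.0]]
delta = frobenius_delta(A_curr, A_prev)
```

With a tolerance of 0.01, the cycle 5 value of 0.004 in the table would count as converged, while the cycle 4 value of 0.03 would not.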

Structural Stability (Conjecture 6.9 — VALIDATED)

When the interaction matrix is perturbed by up to 20%, it recovers to within 1% of its original values in on the order of 10 cycles. This structural stability means the discovered group topology is robust to noise and partial resets.
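A perturbation test of this kind needs two ingredients: a way to perturb \( A \) and a way to measure how far it is from the reference. The sketch below assumes the 20% perturbation is multiplicative and per-entry, and that "within 1%" means per-entry relative deviation; both readings, and all function names, are assumptions. The re-learning loop that would sit between the two calls is omitted.

```python
import random


def perturb(A, max_rel=0.2, seed=0):
    """Scale each entry by a random factor in [1 - max_rel, 1 + max_rel]."""
    rng = random.Random(seed)
    return [[a * (1.0 + rng.uniform(-max_rel, max_rel)) for a in row]
            for row in A]


def max_relative_deviation(A, A_ref, eps=1e-12):
    """Largest per-entry |A - A_ref| / |A_ref|; eps guards zero references."""
    return max(abs(a - r) / max(abs(r), eps)
               for row, row_ref in zip(A, A_ref)
               for a, r in zip(row, row_ref))


def is_recovered(A, A_ref, tol=0.01):
    """True if every entry is within tol (relative) of its reference value."""
    return max_relative_deviation(A, A_ref) <= tol


A_ref = [[0.0, 0.785], [0.3, 0.0]]
A_perturbed = perturb(A_ref, max_rel=0.2)
deviation = max_relative_deviation(A_perturbed, A_ref)
```

Note that a multiplicative perturbation leaves exact zeros at zero, so within-group entries are not disturbed under this reading; an additive perturbation would test those entries as well.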

Experiment Files

exp_interaction_matrix.sx — Meta-gradient learning of interaction matrix, convergence in 5 cycles
exp_symmetry_breaking.sx — Group structure discovery, asymmetric interactions, perturbation recovery
