Conjecture 6.7: Transfer Learning Threshold
Statement
For two tasks \( A \) and \( B \) parameterised by distributions \( p_A \) and \( p_B \), there exists a critical distance \( \delta^* \) such that transfer learning improves performance when \( |p_A - p_B| < \delta^* \) and degrades performance when \( |p_A - p_B| > \delta^* \).
Status: Validated
The crossover occurs at \( |p_A - p_B| \approx 0.15 \). Below this threshold, sharing interaction matrix structure between tasks accelerates convergence. Above it, the transferred structure introduces systematic bias that slows or prevents convergence.
Evidence Summary
The experiment sweeps task distance from 0.0 to 0.5 in steps of 0.01 and measures convergence time with and without transfer:
- \( |p_A - p_B| = 0.05 \): transfer saves 40% of convergence time
- \( |p_A - p_B| = 0.10 \): transfer saves 25%
- \( |p_A - p_B| = 0.15 \): crossover point — transfer has no net effect
- \( |p_A - p_B| = 0.20 \): transfer adds 15% to convergence time
- \( |p_A - p_B| = 0.30 \): transfer adds 45% — significant negative transfer
The crossover is clean and reproducible. The mechanism is that the interaction matrix learned from task \( A \) encodes gradient relationships specific to \( p_A \); when these relationships differ sufficiently from \( p_B \), the transferred structure creates gradient conflicts that the cosine-scaled projection must resolve, adding overhead.
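As a rough illustration of how the crossover can be read off the figures listed above, the sketch below stores the five reported points as relative changes in convergence time (negative means time saved) and finds the zero crossing by linear interpolation. The names `reported` and `estimate_crossover` are hypothetical and are not taken from the experiment scripts. The linear interpolation between neighbouring points is an assumption made only for this example.

```python
# Relative change in convergence time when transfer is used, keyed by
# task distance |p_A - p_B|. Negative = time saved, positive = time added.
# Values are the five points quoted in the evidence summary.
reported = [
    (0.05, -0.40),
    (0.10, -0.25),
    (0.15, 0.00),
    (0.20, 0.15),
    (0.30, 0.45),
]


def estimate_crossover(points):
    """Return the distance at which the relative change crosses zero.

    Assumes linear behaviour between neighbouring measurements, which is
    an illustrative simplification. Returns None if no sign change from
    negative to positive (and no exact zero) is found.
    """
    pts = sorted(points)
    for (d0, c0), (d1, c1) in zip(pts, pts[1:]):
        if c0 == 0.0:
            return d0  # a measured point sits exactly on the crossover
        if c0 < 0.0 < c1:
            # Linear interpolation for the zero of the segment (d0,c0)-(d1,c1).
            return d0 + (d1 - d0) * (-c0) / (c1 - c0)
    if pts[-1][1] == 0.0:
        return pts[-1][0]
    return None


crossover = estimate_crossover(reported)
# Same estimate with the measured crossover point held out.
held_out = estimate_crossover([p for p in reported if p[0] != 0.15])
```

With all five points the function returns the measured crossover at 0.15. With that point held out, interpolating between the 0.10 and 0.20 measurements gives roughly 0.16, which is consistent with the reported value.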
Relevant Experiments
- exp_memory_dynamics.sx: transfer dynamics across task distributions
- exp_sensitivity.sx: robustness of the crossover point to hyperparameters
What This Means
This result provides a practical decision rule: measure the distribution distance between source and target tasks, and transfer only if the distance is below the crossover, which is about 0.15 for the system studied here. The threshold is not a universal constant (it depends on system dimensionality and contraction rates), so it should be re-measured for a different system, but the existence of a sharp crossover is robust. This formalises the well-known practitioner intuition about "negative transfer" and gives it a quantitative basis within the adaptation framework.
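The decision rule can be sketched in a few lines. This is a minimal example under stated assumptions: each task is summarised by a single scalar parameter so that \( |p_A - p_B| \) is an absolute difference, and the default threshold of 0.15 is the value reported here rather than a universal constant. The function name `should_transfer` is hypothetical.

```python
def should_transfer(p_source, p_target, threshold=0.15):
    """Decide whether to reuse structure learned on the source task.

    Assumes scalar task parameters (an illustrative simplification).
    The default threshold is the crossover reported in this note and
    should be re-estimated for a different system.
    """
    distance = abs(p_source - p_target)
    # Strictly below the threshold: at the crossover itself transfer has
    # no net effect, so there is nothing to gain from transferring.
    return distance < threshold


near = should_transfer(0.30, 0.35)   # distance about 0.05: transfer
far = should_transfer(0.20, 0.50)    # distance about 0.30: do not transfer
```

For a system with a different dimensionality or contraction rate, pass the locally measured crossover as `threshold` instead of relying on the default.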