
Experiment: Memory Dynamics

Hypothesis

Belief agents with memory require appropriate forgetting rates, and these rates depend on the environment. Specifically:

  1. Stationary environments favour high retention (\(\lambda^* \approx 1\)); changing environments favour faster forgetting (\(\lambda^* < 1\)).
  2. Meta-gradient descent can recover near-optimal \(\lambda\) without prior knowledge of the environment.
  3. Transfer learning between related tasks has a threshold: below some task similarity \(B^*\), transfer hurts.
  4. Self-referential beliefs (an agent observing its own belief as evidence) always increase error.
  5. There exists a phase transition in the number of interacting beliefs.

Method

Five sub-experiments use Bayesian belief agents with exponential forgetting factor \(\lambda \in [0, 1]\). Forgetting discounts past observations: at current time \(T\), the observation from step \(t\) carries weight \(w_t = \lambda^{T-t}\). Meta-gradient descent adjusts \(\lambda\) online. All streams are Bernoulli with known parameters for ground-truth comparison.
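The weighting \(w_t = \lambda^{T-t}\) can be implemented recursively: decay the sufficient statistics by \(\lambda\), then add the new observation. A minimal sketch in Python (class and parameter names are illustrative, not taken from the experiment code):

```python
import random

class ForgettingBeliefAgent:
    """Bernoulli belief with exponential forgetting (illustrative sketch)."""

    def __init__(self, lam, a=1.0, b=1.0):
        self.lam = lam         # forgetting factor λ ∈ [0, 1]
        self.s = 0.0           # discounted count of ones
        self.n = 0.0           # discounted total count
        self.a, self.b = a, b  # Beta(a, b) prior pseudo-counts

    def update(self, x):
        # Recursive form of w_t = λ^{T-t}: decay everything, then add.
        self.s = self.lam * self.s + x
        self.n = self.lam * self.n + 1.0

    @property
    def estimate(self):
        # Posterior-mean estimate of p under the discounted counts.
        return (self.s + self.a) / (self.n + self.a + self.b)


rng = random.Random(0)
agent = ForgettingBeliefAgent(lam=0.93)
# p shifts from 0.7 to 0.3 at step 250, mirroring the changing condition below.
for t in range(500):
    p = 0.7 if t < 250 else 0.3
    agent.update(1.0 if rng.random() < p else 0.0)
print(round(agent.estimate, 2))  # should sit near the post-shift p = 0.3
```

The recursion weights the observation from step \(t\) by exactly \(\lambda^{T-t}\) at step \(T\), so \(\lambda = 1\) recovers standard Bayesian counting.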

Experiment 1: Optimal Forgetting Rate

Agent observes 500-step stream. Two conditions: stationary (\(p = 0.7\) throughout) and changing (\(p\) shifts from 0.7 to 0.3 at step 250). Grid search over \(\lambda \in [0.80, 1.00]\).

| Environment | Optimal \(\lambda^*\) | Best Loss |
|---|---|---|
| Stationary | 0.99 | \(3.1 \times 10^{-4}\) |
| Changing (shift at \(t = 250\)) | 0.93 | \(8.7 \times 10^{-4}\) |

Result 1

Stationary environments favour near-perfect retention (\(\lambda^* = 0.99\)): every observation is informative and should be weighted equally. Changing environments require faster forgetting (\(\lambda^* = 0.93\)): old observations are misleading after the shift. This confirms Conjecture 6.6.
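A grid search of this shape is easy to sketch. The snippet below is illustrative only — the experiment's exact loss definition and seeds are not given, so the recovered optima need not match the table precisely — but it shows the stationary optimum sitting above the changing one:

```python
import random

def mean_sq_loss(lam, shift=None, T=500, seeds=range(10)):
    """Average squared error of a λ-forgetting estimator over several streams."""
    total = 0.0
    for seed in seeds:
        rng = random.Random(seed)
        s = n = err = 0.0
        for t in range(T):
            p = 0.7 if (shift is None or t < shift) else 0.3
            x = 1.0 if rng.random() < p else 0.0
            s = lam * s + x    # discounted count of ones
            n = lam * n + 1.0  # discounted sample size
            err += ((s + 1.0) / (n + 2.0) - p) ** 2  # Beta(1,1)-smoothed estimate
        total += err / T
    return total / len(seeds)

grid = [round(0.80 + 0.01 * k, 2) for k in range(21)]  # λ ∈ [0.80, 1.00]
best_stationary = min(grid, key=mean_sq_loss)
best_changing = min(grid, key=lambda lam: mean_sq_loss(lam, shift=250))
print(best_stationary, best_changing)
```

Averaging over several seeds matters: on a single 500-step stream, sampling noise can move the empirical optimum by a few grid points.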

Experiment 2: Meta-Gradient Recovery of \(\lambda\)

Agent starts with \(\lambda = 0.50\) (far from optimal) and uses meta-gradient descent to adjust \(\lambda\) online. No knowledge of whether the environment is stationary or changing.

| Environment | Initial \(\lambda\) | Recovered \(\lambda\) | Gap from \(\lambda^*\) |
|---|---|---|---|
| Stationary | 0.50 | 0.981 | 0.009 (0.9%) |
| Changing | 0.50 | 0.931 | 0.001 (0.1%) |

Result 2

Meta-gradient recovers near-optimal forgetting rates from a poor initialisation. In the stationary case, \(\lambda\) converges to 0.981 (within 0.9% of the grid-search optimum 0.99). In the changing case, it converges to 0.931 (within 0.1% of 0.93). The meta-gradient is more accurate in the changing environment because the gradient signal is stronger when \(\lambda\) matters more.
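The experiment's online meta-gradient is not listed here. As a sketch, a batch variant that descends the one-step prediction loss with central finite differences shows the same qualitative behaviour (function names, learning rate, and iteration count are illustrative assumptions):

```python
import random

def pred_loss(lam, stream):
    """Mean one-step prediction loss of a λ-forgetting estimator."""
    s = n = total = 0.0
    for x in stream:
        est = (s + 1.0) / (n + 2.0)  # predict before seeing x
        total += (est - x) ** 2
        s = lam * s + x              # exponential forgetting update
        n = lam * n + 1.0
    return total / len(stream)

def meta_descent(stream, lam=0.50, lr=0.1, eps=1e-3, iters=300):
    """Gradient descent on λ via central finite differences of the loss."""
    for _ in range(iters):
        g = (pred_loss(lam + eps, stream) - pred_loss(lam - eps, stream)) / (2 * eps)
        lam = min(max(lam - lr * g, 0.01), 0.999)  # keep λ in a valid range
    return lam

rng = random.Random(0)
stationary = [1.0 if rng.random() < 0.7 else 0.0 for _ in range(500)]
changing = [1.0 if rng.random() < (0.7 if t < 250 else 0.3) else 0.0
            for t in range(500)]
lam_s = meta_descent(stationary)
lam_c = meta_descent(changing)
print(round(lam_s, 3), round(lam_c, 3))  # stationary λ settles above changing λ
```

An online variant would instead apply one small gradient step per observation, which is what lets the agent adapt without knowing whether the environment is stationary.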

Experiment 3: Transfer Learning Threshold

Agent trained on task A (\(p_A = 0.7\)) transfers its posterior to task B with parameter \(p_B\). Similarity is indexed by \(B = p_B\); since every \(p_B \leq p_A\), larger \(B\) means task B is closer to task A. Measures whether the transferred prior helps or hurts versus a fresh start.

| Task B Parameter | Similarity \(B\) | Transfer Effect | Verdict |
|---|---|---|---|
| \(p_B = 0.65\) | 0.65 (high) | 12% faster convergence | Helps |
| \(p_B = 0.50\) | 0.50 (moderate) | 2% slower convergence | Marginal hurt |
| \(p_B = 0.20\) | 0.20 (low) | 34% slower convergence | Significant hurt |

Result 3

Transfer helps when task similarity \(B \geq 0.65\) and hurts when \(B \leq 0.20\). The crossover point is approximately \(B^* \approx 0.55\). Below this threshold, the transferred prior is sufficiently wrong that it takes longer to unlearn than to learn from scratch. This validates Conjecture 6.7: there exists a sharp transfer threshold dependent on task similarity.
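The unlearning cost is easy to see in a sketch where "transfer" means starting task B from the pseudo-counts of a converged task-A agent. The prior strength of 50 effective observations below is an assumption for illustration, not a parameter from the experiment:

```python
import random

def cum_sq_error(p, prior_s, prior_n, T=200, seed=3):
    """Cumulative squared error of a Beta-count learner on a Bernoulli(p) task."""
    rng = random.Random(seed)
    s, n, err = prior_s, prior_n, 0.0
    for _ in range(T):
        x = 1.0 if rng.random() < p else 0.0
        s += x
        n += 1.0
        err += ((s + 1.0) / (n + 2.0) - p) ** 2  # Beta(1,1)-smoothed estimate
    return err

p_a = 0.7
prior_s, prior_n = p_a * 50.0, 50.0  # ≈ a task-A posterior worth 50 observations
for p_b in (0.65, 0.50, 0.20):
    transferred = cum_sq_error(p_b, prior_s, prior_n)
    fresh = cum_sq_error(p_b, 0.0, 0.0)
    print(p_b, "transfer helps" if transferred < fresh else "transfer hurts")
```

A strong prior near the truth suppresses early-sample variance, but a strong prior far from the truth must be washed out observation by observation, which is exactly the unlearning penalty the result describes.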

Experiment 4: Self-Referential Belief

An agent feeds its own belief back as an additional observation, with coupling strengths \(\alpha_{\text{self}} \in \{0.1, 0.3, 0.5, 0.8\}\). Compared against baseline (no self-reference).

| Self-coupling \(\alpha_{\text{self}}\) | Final Loss | vs Baseline |
|---|---|---|
| 0.0 (baseline) | \(4.92 \times 10^{-4}\) | |
| 0.1 | \(5.31 \times 10^{-4}\) | +7.9% worse |
| 0.3 | \(6.87 \times 10^{-4}\) | +39.6% worse |
| 0.5 | \(9.14 \times 10^{-4}\) | +85.8% worse |
| 0.8 | \(1.83 \times 10^{-3}\) | +272% worse |

Result 4: Self-Reference Always Hurts

All self-coupling strengths increase error. Even minimal self-reference (\(\alpha = 0.1\)) degrades performance by 7.9%. At \(\alpha = 0.8\), the agent locks into a self-confirming loop with 272% worse loss. Self-referential beliefs create a positive feedback loop: the agent double-counts its own uncertainty as evidence, amplifying noise. This validates Conjecture 6.10.
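A stylised sketch of the feedback loop: each step, the agent re-ingests \(\alpha\) copies of its own belief as pseudo-evidence. In this linearised form the update is equivalent to an exponential smoother with effective retention \((\lambda + \alpha)/(1 + \alpha)\), so with \(\lambda\) already at the stationary optimum 0.99 and a mid-run shift, any self-coupling pushes retention further toward 1 and slows adaptation. All parameters here are illustrative, not the experiment's:

```python
import random

def self_ref_loss(alpha, lam=0.99, T=500, shift=250, seed=5):
    """Mean squared error when the agent feeds α × its own belief back as evidence."""
    rng = random.Random(seed)
    s = n = err = 0.0
    for t in range(T):
        p = 0.7 if t < shift else 0.3
        x = 1.0 if rng.random() < p else 0.0
        est = (s + 1.0) / (n + 2.0)
        # Real observation plus α pseudo-observations equal to the current belief.
        s = lam * s + x + alpha * est
        n = lam * n + 1.0 + alpha
        err += ((s + 1.0) / (n + 2.0) - p) ** 2
    return err / T

for a in (0.0, 0.1, 0.5, 0.8):
    print(a, round(self_ref_loss(a), 5))  # larger α → slower adaptation after the shift
```

The equivalence to a higher effective \(\lambda\) is a property of this sketch, not a claim about the experiment's implementation; it does, however, match the observed pattern that stronger self-coupling monotonically degrades tracking.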

Experiment 5: Phase Transition in Belief Count

Varying the number of interacting belief agents \(K\) from 2 to 10, all observing correlated streams. Measuring mean error as a function of \(K\).

| \(K\) (agents) | Mean Error | Notes |
|---|---|---|
| 2 | \(2.1 \times 10^{-4}\) | Baseline |
| 3 | \(1.8 \times 10^{-4}\) | Improvement |
| 4 | \(1.6 \times 10^{-4}\) | Diminishing returns |
| 5 | \(1.5 \times 10^{-4}\) | Near plateau |
| 6 | \(3.2 \times 10^{-4}\) | Phase transition |
| 8 | \(5.7 \times 10^{-4}\) | Degraded |
| 10 | \(8.1 \times 10^{-4}\) | Strongly degraded |

Result 5

A sharp phase transition occurs at \(K = 5\). Below this threshold, adding agents improves mean error (information benefit exceeds coordination cost). At \(K = 6\), error jumps from \(1.5 \times 10^{-4}\) to \(3.2 \times 10^{-4}\) — a 2.1× discontinuity. Beyond \(K = 5\), the interaction matrix becomes too large for the meta-gradient to optimise within the observation window, and spurious couplings dominate.

Analysis

The five sub-experiments validate three conjectures:

  • Conjecture 6.6 (Optimal Forgetting): Validated. The optimal forgetting rate \(\lambda^*\) depends on environmental stationarity. Meta-gradient recovers \(\lambda^*\) to within 1% from poor initialisation.
  • Conjecture 6.7 (Transfer Threshold): Validated. Transfer helps when similarity \(B \geq 0.65\), hurts when \(B \leq 0.20\), with crossover at \(B^* \approx 0.55\).
  • Conjecture 6.10 (Self-Referential Belief): Validated. Self-reference universally increases error due to positive feedback amplification. No safe coupling strength exists.

The phase transition at \(K = 5\) is an additional finding not directly predicted by the conjectures but consistent with the interaction matrix scaling analysis in Theorem 4.

Conclusion

All three conjectures are validated. Stationary \(\lambda^* = 0.99\) and changing \(\lambda^* = 0.93\), both recovered by meta-gradient descent. Transfer threshold at \(B^* \approx 0.55\). Self-reference always degrades performance (by 7.9% to 272%). Phase transition at \(K = 5\) agents. The memory dynamics of belief agents are characterised by the forgetting-stationarity tradeoff.

Reproducibility

# Clone and build (clone into ./simplex so the relative paths below resolve)
git clone https://github.com/senuamedia/lab.git simplex
cd simplex && ./build.sh && cd ..

# Clone theorem-proof
git clone https://github.com/senuamedia/theorem-proof.git
cd theorem-proof

# Compile (create the output directory first)
mkdir -p build
../simplex/build/sxc exp_memory_dynamics.sx -o build/exp_memory_dynamics.ll

# Link with runtime
OPENSSL_PREFIX=$(brew --prefix openssl)
clang -O2 build/exp_memory_dynamics.ll \
  ../simplex/runtime/standalone_runtime.c \
  -I"$OPENSSL_PREFIX/include" \
  -L"$OPENSSL_PREFIX/lib" \
  -lssl -lcrypto -lm \
  -o build/exp_memory_dynamics

# Run
./build/exp_memory_dynamics

Related Theorems