qwen2.5:7b AGI-1 training ×20 correct=0/20 cell≈0% texture collapse
qwen2.5:7b AGI-1 ×8editions ×20 correct=0/20 cell=0% rotation → flatline
qwen2.5:14b AGI-1 v8 bench ×5×54 avg 28.7% AGI-1 MIRAGE
↳ smaller repetitive grids · transformer texture matching · NOT reasoning
── AGI-2 RUNS ──────────────────────────────────────────────────────────────
qwen2.5:7b AGI-2 training ×20 correct=0/20 cell=21.6% WALL
qwen2.5:14b AGI-2 training ×20 correct=1/20 cell=32.8% 017c7c7b (tiling guess)
qwen2.5:14b AGI-2 EVAL ×10 correct=0/10 cell=8.9% ████ THE WALL ████
── RUN 4: OPTIC NERVE ACTIVE ────────────────────────────────────────────────
qwen2.5:14b AGI-2 training ×5 (objects) correct=0/5 cell=31.5% +22.6pp vs eval
↳ 009d5c81: 83.7% cell match · 3-object task · rule PERCEIVED · near-hit
↳ 007bbfb7: 51.9% cell match · 2-object task · transform 50% right
↳ 00d62c1b: 0.0% cell match · 15-object task · synthesis window exceeded
Raw grid → object descriptors = 3.5× cell match improvement.
009d5c81 at 83.7% proves the model CAN reason about spatial rules —
it just couldn't see the objects through 900 raw integers.
AUTOPSY — WHY THE WALL IS PERMANENT
AGI-1 at 25-35% was a mirage.
Small grids. Repetitive patterns. Colour fills with 2-3 objects max.
The transformer learned to pattern-match the texture of ARC training grids —
not the rules. Rotate any task by 90°: score drops to ~8%. That's the tell.
AGI-2 destroyed the mirage. 30×30 grids. Novel topologies per task.
No two tasks share a surface pattern. The model receives
900 integers and no geometry. Its attention mechanism
cannot localise object boundaries. It hallucinates pixel values with no
spatial grounding. Cell match on eval: 8.9%. Not even close
to the right shape.
017c7c7b was a tiling guess. That task's output was a
simple periodic repetition of the input pattern. The model got lucky on a
texture it'd seen variants of. Remove it and the training score is 0/19.
Rick's Law holds:
You cannot transform an object you cannot perceive.
The LLM is not the retina. It was never supposed to be the retina.
The Cathedral's formally verified Optic Nerve is the retina.
PHASE 3 — THE DSL HANDS
The Cathedral has eyes. Now it needs hands.
OVERSEER receives object descriptors. It synthesises a transform.
But that transform must be formally verified —
a Lean 4 DSL primitive, not a hallucinated pixel grid.
Two verbs to build:
Run 5 hypothesis: 009d5c81 goes
83.7% → 100% when OVERSEER outputs
translate_object obj 0 2
and Lean verifies bounds before applying.
The LLM has strong P_c (synthesis capability) and catastrophically weak S (spatial signal).
The Optic Nerve is the S booster. Object descriptors are strong signal.
Run 4 proved it: same model, same hardware, 3.5× better signal → 3.5× better cell match.