RUN LOG — ALL EXPERIMENTS · HONEST SCORES
| Model | Dataset | Tasks / Correct | Cell Match | Verdict |
|---|---|---|---|---|
| qwen2.5:7b | AGI-1 training (base) | 20 / 0 | 19.7% | Pattern Noise |
| qwen2.5:7b | AGI-1 training ×8 editions | 160 / 0 | 0.0% | Memorised, Not Learned |
| qwen2.5:14b | AGI-2 training | 20 / 1 (5%) | 0.3% | First Blood — 017c7c7b ✓ |
| qwen2.5:14b | AGI-2 eval | 10 / 0 | 0.1% ← THE WALL | Bounced |
| qwen2.5:7b | AGI-2 training | 20 / 0 | 0.2% | Bounced |
TERMINAL DIAGNOSTIC:
The 7b hit 19.7% cell match on base AGI-1 with 0 of 20 tasks solved: pure memorisation.
The moment the 8× parameter-varied editions were added (160 tasks), it flatlined to 0.0%.
It didn't learn to reason; it memorised the training pixels.
Memory broke the instant the geometry shifted.
The 14b hit AGI-2 eval at 0.1% cell match on 30×30 grids. Mathematically blind. Not a parameter failure — a perception failure. The model cannot perceive discrete topological objects. Pumping 14B parameters at it generates 14B wrong guesses.
Conclusion: You cannot scale your way out of this. The architecture must change.
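The log never spells out how cell match is computed. A minimal sketch, assuming it means the share of target-grid cells that the predicted grid reproduces exactly; `Grid`, `matchingCells`, `totalCells` and `cellMatchPct` are illustrative names, not part of the IRF-ARC-DSL code:

```lean
-- Hypothetical stand-in: a grid as a list of rows of color indices.
abbrev Grid := List (List Nat)

/-- Count positions present in both grids whose colors agree. -/
def matchingCells (pred gold : Grid) : Nat :=
  (List.zip pred gold).foldl
    (fun (acc : Nat) (rows : List Nat × List Nat) =>
      acc + ((List.zip rows.1 rows.2).filter
        (fun (ab : Nat × Nat) => ab.1 == ab.2)).length)
    0

/-- Total number of cells in the target grid. -/
def totalCells (gold : Grid) : Nat :=
  gold.foldl (fun (n : Nat) (row : List Nat) => n + row.length) 0

/-- Cell match as a percentage of target cells (0 for an empty target). -/
def cellMatchPct (pred gold : Grid) : Float :=
  if totalCells gold == 0 then 0.0
  else 100.0 * (matchingCells pred gold).toFloat / (totalCells gold).toFloat

-- 3 of the 4 target cells agree.
#eval cellMatchPct [[1, 0], [0, 2]] [[1, 0], [3, 2]]
```

Under this reading a model can score a nonzero cell match while solving zero tasks, which is the pattern the base AGI-1 row shows.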
THE PIVOT — objects() STRATEGY (IRF-ARC-DSL-003)
1. Lean 4 parses the grid: BFS flood-fill extracts List (ARC_Object g). 900 integers → N discrete entities. Dimensionality collapses. Signal spikes.
2. LLM receives the object representation, not pixel arrays: "Object at (5,5), color 2, bbox (3×3), 9px"
3. LLM outputs a single Lean 4 line: translate_object obj1 2 0
4. Lean 4 compiler verifies it. Proof-checked. Done.
That is how you crack the 30×30 wall.
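The object description handed to the LLM in step 2 could be produced roughly as follows. This is only a sketch: `ObjSummary`, `bbox` and `describe` are hypothetical names, and it assumes the reported position is the top-left corner of the bounding box, which the example string does not confirm.

```lean
-- Hypothetical stand-in for an extracted object: a color index plus (row, col) cells.
structure ObjSummary where
  color : Nat
  cells : List (Nat × Nat)

/-- Bounding box as (minRow, minCol, height, width); `none` for an empty cell list. -/
def bbox (cells : List (Nat × Nat)) : Option (Nat × Nat × Nat × Nat) :=
  match cells with
  | [] => none
  | (r, c) :: rest =>
    -- Accumulator is (minRow, maxRow, minCol, maxCol).
    let (r0, r1, c0, c1) := rest.foldl
      (fun (acc : Nat × Nat × Nat × Nat) (p : Nat × Nat) =>
        (min acc.1 p.1, max acc.2.1 p.1, min acc.2.2.1 p.2, max acc.2.2.2 p.2))
      (r, r, c, c)
    some (r0, c0, r1 - r0 + 1, c1 - c0 + 1)

/-- Render the one-line description the LLM sees instead of raw pixels. -/
def describe (o : ObjSummary) : String :=
  match bbox o.cells with
  | none => "empty object"
  | some (r, c, h, w) =>
    s!"Object at ({r},{c}), color {o.color}, bbox ({h}×{w}), {o.cells.length}px"

#eval describe { color := 2, cells := [(5, 5), (5, 6), (6, 5)] }
```

The point of the design is that the prompt grows with the number of objects rather than with the 900 cells of a 30×30 grid.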
ACTIVE BURN: IRF-ARC-DSL-003
THE OPTIC NERVE. Abandoning raw grid inference. Deterministic flood-fill object extraction in Lean 4. LLM acts only on verified ARC_Object topologies — bypassing the 30×30 noise wall entirely.
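For orientation, here is a fuel-bounded flood fill in core Lean 4. It is not the verified extract_objects: it works on plain lists instead of ARC_Grid, proves none of the ARC_Object properties, and replaces the termination proof with an explicit step budget. All names (`Grid`, `colorAt`, `neighbors4`, `floodFill`, `componentAt`) are illustrative.

```lean
-- Hypothetical stand-in: a grid as a list of rows of color indices.
abbrev Grid := List (List Nat)

/-- Color at (row, col), or `none` outside the grid. -/
def colorAt (g : Grid) (p : Nat × Nat) : Option Nat :=
  match g[p.1]? with
  | some row => row[p.2]?
  | none => none

/-- The 4-neighbours of a cell; neighbours below zero are dropped. -/
def neighbors4 (p : Nat × Nat) : List (Nat × Nat) :=
  let up : List (Nat × Nat) := if p.1 > 0 then [(p.1 - 1, p.2)] else []
  let left : List (Nat × Nat) := if p.2 > 0 then [(p.1, p.2 - 1)] else []
  up ++ left ++ [(p.1 + 1, p.2), (p.1, p.2 + 1)]

/-- BFS over same-colored cells. Arguments: fuel, queue, cells seen so far.
    Recursion is structural on the fuel, so no termination proof is needed. -/
def floodFill (g : Grid) (color : Nat) :
    Nat → List (Nat × Nat) → List (Nat × Nat) → List (Nat × Nat)
  | 0, _, seen => seen
  | _, [], seen => seen
  | fuel + 1, p :: queue, seen =>
    if seen.contains p || colorAt g p != some color then
      floodFill g color fuel queue seen
    else
      floodFill g color fuel (queue ++ neighbors4 p) (p :: seen)

/-- Connected component containing `start`. Each accepted cell enqueues at most
    four neighbours, so 4 * (cell count) + 1 steps always suffice. -/
def componentAt (g : Grid) (start : Nat × Nat) : List (Nat × Nat) :=
  match colorAt g start with
  | none => []
  | some c =>
    let total := g.foldl (fun (n : Nat) (row : List Nat) => n + row.length) 0
    floodFill g c (4 * total + 1) [start] []

-- Collects the three 1-cells connected to (0,0); the isolated 1 at (2,2) is excluded.
#eval componentAt [[1, 1, 0], [0, 1, 0], [2, 0, 1]] (0, 0)
```

Running `componentAt` from every not-yet-visited cell would enumerate all objects; the connectivity and maximality proofs are the part this sketch leaves out.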
structure ARC_Object {H W : ℕ} (g : ARC_Grid H W) where
  color : ARC_Color
  pixels : Finset (ARC_Cell H W)
  is_not_empty : pixels.Nonempty
  is_monochrome : ∀ p ∈ pixels, g p = color
  is_connected : ∀ p1 ∈ pixels, ∀ p2 ∈ pixels,
    connected4_in_finset pixels p1 p2
  is_maximal : ∀ p, g p = color →
    (∃ p' ∈ pixels, adjacent4 p p') →
    p ∈ pixels

def extract_objects ... : List (ARC_Object g) :=
  sorry -- OVERSEER synthesizes BFS here

def translate_object
    (obj : ARC_Object g) (dx dy : Int) :
    Option (Finset (ARC_Cell H W)) := ...
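The body of translate_object is elided above. One way the Option result can encode the grid walls is sketched below, assuming a cell is a pair `Fin H × Fin W` of (row, column), that dx moves along columns and dy along rows, and using `List` in place of `Finset` to stay within core Lean. `shiftCoord`, `shiftCell` and `translateCells` are hypothetical names, not the DSL-002 definitions.

```lean
/-- Shift one coordinate inside `Fin n` by an integer offset;
    `none` if the result leaves the range 0 .. n-1. -/
def shiftCoord {n : Nat} (i : Fin n) (d : Int) : Option (Fin n) :=
  let j : Int := (i.val : Int) + d
  if h : 0 ≤ j ∧ j.toNat < n then some ⟨j.toNat, h.2⟩ else none

/-- Shift a (row, col) cell; `none` as soon as either coordinate hits a wall. -/
def shiftCell {H W : Nat} (p : Fin H × Fin W) (dx dy : Int) :
    Option (Fin H × Fin W) :=
  match shiftCoord p.1 dy, shiftCoord p.2 dx with
  | some r, some c => some (r, c)
  | _, _ => none

/-- Translate every cell of an object; the whole move fails if any cell leaves the grid. -/
def translateCells {H W : Nat} (cells : List (Fin H × Fin W)) (dx dy : Int) :
    Option (List (Fin H × Fin W)) :=
  cells.mapM (fun p => shiftCell p dx dy)
```

Because the bounds proof is stored inside each `Fin`, an out-of-grid translation cannot produce a cell at all; it can only return `none`.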
[ SYSTEM ] : AWAITING OBJECTS() ENUMERATION
[ STATUS ] : [ RUNNING — BFS SYNTHESIS ]
IRF-ARC-DSL REGISTRY
✅ DSL-001: ARC_Grid, ARC_Color, ARC_Cell, 10-color finite type
✅ DSL-002: Translation (shift_cell, Option walls, translate_object)
🔵 DSL-003: THE OPTIC NERVE. ARC_Object (4 axioms) + extract_objects BFS = OVERSEER synthesis target
🔴 DSL-004: flood_fill (BFS + termination proof)
✅ DSL-005: reflect_h, reflect_v (involutive proofs closed)
🔴 DSL-006: Synthesis engine (find f | f(X₁)=Y₁ ∧ f(X₂)=Y₂)
✅ DSL-007: map_colors / recolor (src→dst mapping closed)
✅ DSL-008: overlay (object permanence, layer wins)
🔴 DSL-009: rotate_90 (H=W square constraint)