GRAIL PROTOCOL
AWAITING RUN 5 RESULTS · Load arc_overseer_run5_results.json below
The score comes from running it. Not from predicting it.
BASELINE · RAW MATRIX
900 integers → no grounding.
0/10 eval tasks solved.
The wall is the interface, not the model.
RUN 4 · OPTIC NERVE
24.3%
CELL MATCH · REPRESENTATION
Objects extracted, semantics live.
3 near-misses (58–71%).
The model sees the board.
Motor controls: ❌
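A minimal sketch of how a cell match figure like the one above could be computed. The dashboard does not define the metric, so this assumes it is the fraction of target cells the predicted grid reproduces exactly. The name `cell_match` and the choice to score a shape mismatch as 0.0 are assumptions made for illustration.

```python
from typing import List

Grid = List[List[int]]


def cell_match(predicted: Grid, target: Grid) -> float:
    """Fraction of target cells reproduced exactly by the prediction.

    Assumed convention: grids with different shapes score 0.0.
    """
    same_height = len(predicted) == len(target)
    same_widths = same_height and all(
        len(p_row) == len(t_row) for p_row, t_row in zip(predicted, target)
    )
    if not same_widths:
        return 0.0

    total = sum(len(row) for row in target)
    if total == 0:
        return 0.0

    hits = sum(
        p == t
        for p_row, t_row in zip(predicted, target)
        for p, t in zip(p_row, t_row)
    )
    return hits / total


# Illustrative 3x3 grids, not taken from any eval task.
target = [[0, 1, 1], [0, 2, 0], [0, 0, 0]]
predicted = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
score = cell_match(predicted, target)  # 8 of 9 cells agree
```

Under this definition a task counts as correct only at a score of 1.0, which is why a high cell match can coexist with 0 solved tasks.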
RUN 5 · DSL EXECUTION
PENDING
CELL MATCH · DSL HANDS
translate_object: wired
recolor_object: wired
is_monochrome: ✅ proven
[ load results to score ]
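The three DSL primitives named above could look roughly like the sketch below. Only the function names come from the dashboard. The `GridObject` type, the argument lists, and the cell-to-color representation are hypothetical, chosen to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Cell = Tuple[int, int]  # (row, col)


@dataclass
class GridObject:
    """Hypothetical object representation: occupied cells mapped to colors."""
    cells: Dict[Cell, int]


def translate_object(obj: GridObject, d_row: int, d_col: int) -> GridObject:
    """Return a copy of the object shifted by (d_row, d_col)."""
    return GridObject(
        {(r + d_row, c + d_col): color for (r, c), color in obj.cells.items()}
    )


def recolor_object(obj: GridObject, color: int) -> GridObject:
    """Return a copy of the object with every cell set to one color."""
    return GridObject({cell: color for cell in obj.cells})


def is_monochrome(obj: GridObject) -> bool:
    """True when the object uses at most one color."""
    return len(set(obj.cells.values())) <= 1


# Illustrative two-cell object with two colors.
obj = GridObject({(0, 0): 3, (0, 1): 4})
moved = translate_object(obj, 2, 1)
painted = recolor_object(moved, 5)
```

Returning new objects instead of mutating in place keeps each command replayable, which helps when a command sequence has to be scored cell by cell afterwards.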
RUN 5 HYPOTHESES · RESOLUTION
H1
009d5c81 at 83.7% → 100%. A 3-object task: the rule was perceived in Run 4, and the DSL gives the motor control to finish.
— PENDING
H2
136b0064 / 1ae2feb7 / 16de56c4 improve to >85%. Gap was motor, not perception.
— PENDING
H3
Object complexity ceiling holds. >15-object tasks stay near zero. DSL fixes motor, not combinatorial search.
— PENDING
H4
Overall cell match > 40%. First correct eval task possible.
— PENDING
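Once results are loaded, the four hypotheses above could be resolved mechanically. The sketch below is one possible reading and not the dashboard's own logic. The record fields (`task_id`, `cell_match` as a percentage, `n_objects`), the 5% cutoff for "near zero", and treating "overall" as the mean across tasks are all assumptions, and the sample records are synthetic placeholders, not Run 5 results.

```python
from typing import Dict, List

NEAR_ZERO_PCT = 5.0  # assumed cutoff for "near zero" in H3
H2_TASKS = ["136b0064", "1ae2feb7", "16de56c4"]


def verdict(ok: bool) -> str:
    return "CONFIRMED" if ok else "REFUTED"


def resolve_hypotheses(tasks: List[dict]) -> Dict[str, str]:
    """Map H1..H4 to CONFIRMED, REFUTED, or PENDING when data is missing."""
    by_id = {t["task_id"]: t for t in tasks}
    out: Dict[str, str] = {}

    # H1: 009d5c81 reaches 100% cell match.
    h1 = by_id.get("009d5c81")
    out["H1"] = verdict(h1["cell_match"] >= 100.0) if h1 else "PENDING"

    # H2: all three near-miss tasks exceed 85%.
    if all(task_id in by_id for task_id in H2_TASKS):
        out["H2"] = verdict(
            all(by_id[task_id]["cell_match"] > 85.0 for task_id in H2_TASKS)
        )
    else:
        out["H2"] = "PENDING"

    # H3: tasks with more than 15 objects stay near zero.
    crowded = [t for t in tasks if t["n_objects"] > 15]
    if crowded:
        out["H3"] = verdict(
            all(t["cell_match"] <= NEAR_ZERO_PCT for t in crowded)
        )
    else:
        out["H3"] = "PENDING"

    # H4: overall (assumed: mean) cell match exceeds 40%.
    if tasks:
        overall = sum(t["cell_match"] for t in tasks) / len(tasks)
        out["H4"] = verdict(overall > 40.0)
    else:
        out["H4"] = "PENDING"

    return out


# Synthetic placeholder records for illustration only.
sample = [
    {"task_id": "009d5c81", "cell_match": 100.0, "n_objects": 3},
    {"task_id": "136b0064", "cell_match": 90.0, "n_objects": 6},
    {"task_id": "1ae2feb7", "cell_match": 80.0, "n_objects": 5},
    {"task_id": "16de56c4", "cell_match": 88.0, "n_objects": 7},
    {"task_id": "example_crowded", "cell_match": 2.0, "n_objects": 20},
]
resolution = resolve_hypotheses(sample)
```

With no records loaded, every hypothesis stays PENDING, matching the state shown above.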
TASK BREAKDOWN · load results to populate
| TASK ID | CELL MATCH | BAR | CORRECT | CMDS | ERRS | LATENCY |
| — awaiting arc_overseer_run5_results.json — |
GRAIL PROTOCOL · ABR-841 · γ₁ = 14.134725141734693