ARC-AGI V8 — Snakes & Ladders · Hypothesis Engine

ARC-AGI-1 — Model Performance


#	Model	Score	Actions	Replay	Published	Notes

ARC-AGI-1 — Task Browser


Task ID	Dataset	Human	Best AI	EOSE	Category (γ₁)

18 Transformation Categories — V8 γ₁-distance taxonomy

SOVEREIGN zone = dist ≤ 0.5 · arc-worker-v8 handles 12 · +V4 = new in v4

All Models — Score Comparison

γ₁-Distance Spiral — Fleet Category Map

EOSE Fleet — arc-worker-v8 · ABR-008

5-Lane Architecture

Progress Toward Parity

V8 Engine — Snakes & Ladders Hypothesis Machine

🚪 THE 66% DOOR

Every ARC task is a hypothesis. The examples are the corpus. The new grid is the test. Below 66%: don't commit. Above 66%: always an answer — ladder or snake, never silence.
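The door rule above can be sketched as a small function. This is a minimal illustration, not the engine's code: the names `DOOR`, `resolve`, and `matches_output` are hypothetical. The text does not say how a score of exactly 0.66 is treated, so this sketch assumes it counts as below the door.

```python
from typing import Optional

DOOR = 0.66  # the 66% threshold named in the text


def resolve(score: float, matches_output: bool) -> Optional[str]:
    """Return None below the door, otherwise 'ladder' or 'snake'.

    Assumption: a score of exactly 0.66 is treated as below the door,
    because the text only says "above 66%".
    """
    if score <= DOOR:
        return None  # below the door: don't commit
    # above the door there is always an answer, right or wrong
    return "ladder" if matches_output else "snake"
```

For example, `resolve(0.5, True)` abstains, while `resolve(0.9, False)` still commits and records a snake.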

🎲 THE SIX-SIDED DIE

Every hypothesis roll returns one of six faces. Two ladders. One stay. Two snakes. One wild.

FACE	DIE	MOVE	MEANING
1	🪴	LADDER UP	Hypothesis confirmed by output
2	🪴	LADDER UP	Paired hypothesis resolves
3	⏸	STAY	Inconclusive — more examples needed
4	🐍	SNAKE DOWN	Output contradicts hypothesis
5	🐍	SNAKE DOWN	Supporting hypothesis fails
6	🎲	WILD	MEGSCIFIAR opens new structure — board grows
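One way to encode the six faces is a lookup from face number to move and meaning. The sketch below copies the face descriptions from the list above. The `Move` enum, the `FACES` table, and `move_for` are illustrative names, not identifiers from arc-worker-v8.

```python
from enum import Enum


class Move(Enum):
    LADDER_UP = "LADDER UP"
    STAY = "STAY"
    SNAKE_DOWN = "SNAKE DOWN"
    WILD = "WILD"


# Face number -> (move, meaning), as listed in the text.
FACES = {
    1: (Move.LADDER_UP, "Hypothesis confirmed by output"),
    2: (Move.LADDER_UP, "Paired hypothesis resolves"),
    3: (Move.STAY, "Inconclusive, more examples needed"),
    4: (Move.SNAKE_DOWN, "Output contradicts hypothesis"),
    5: (Move.SNAKE_DOWN, "Supporting hypothesis fails"),
    6: (Move.WILD, "MEGSCIFIAR opens new structure, board grows"),
}


def move_for(face: int) -> Move:
    """Look up the move for a die face; reject anything outside 1..6."""
    if face not in FACES:
        raise ValueError(f"face must be 1..6, got {face}")
    return FACES[face][0]
```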

🏛 FLOOR 7 — THE FLOOR OF ALL FLOORS

Hypotheses other hypotheses depend on. Find these first. Every path to FINISH passes through floor=7.

Current floor=7: GRID-SCALE · COLOR-MAP · SYMMETRY
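"Find these first" can be read as an evaluation order: highest floor first. The sketch below assumes exactly that reading and uses the floor values shown on the current board. `FLOORS` and `evaluation_order` are hypothetical names, and the source does not specify how ties within a floor are broken, so listing order is kept.

```python
# Floor per hypothesis, taken from the current board on this page.
FLOORS = {
    "GRID-SCALE": 7,
    "COLOR-MAP": 7,
    "SYMMETRY": 7,
    "CROP": 6,
    "FILL": 6,
    "REFLECTION": 6,
    "ROTATION": 5,
    "COMPOSE": 5,
    "SEMANTIC-MATCH": 3,
}


def evaluation_order(floors: dict) -> list:
    """Order hypotheses by floor, highest first.

    sorted() is stable, so hypotheses on the same floor keep their
    listing order (an assumption; the source gives no tie-break rule).
    """
    return sorted(floors, key=lambda name: -floors[name])


order = evaluation_order(FLOORS)
print(order[:3])  # the three floor=7 hypotheses come first
```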

⚙ THE THREE PARTS — ARC-WORKER-V8 PIPELINE

PART 1: SPATIAL FOLD

forge-fold-spatial

steamdeck-field :9421

gamma-1 distance

"Where are things?"

SOVEREIGN: dist < 0.35

→

PART 2: SEMANTIC READ

forge-fold-logic

hypothesis field

color + rotation + crop

"What ARE they?"

APPROACH: 0.35-0.65

→

PART 3: COMPOSE

arc-worker-v8 COMPOSE lane

multi-hypothesis synthesis

MEGSCIFIAR face 6

"What does it mean together?"

GAME: > 0.66
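The three bands attached to the pipeline parts can be sketched as a classifier. This assumes all three bands are measured on the same scale, which the page does not state, and it uses only the thresholds listed in the three parts. The stated ranges leave values above 0.65 and up to 0.66 unassigned, so the hypothetical `zone` function returns `None` there rather than guessing.

```python
from typing import Optional


def zone(value: float) -> Optional[str]:
    """Map a value to the band named in the pipeline description.

    Assumption: SOVEREIGN, APPROACH and GAME share one scale.
    Returns None for 0.65 < value <= 0.66, which the text leaves open.
    """
    if value < 0.35:
        return "SOVEREIGN"
    if value <= 0.65:
        return "APPROACH"
    if value > 0.66:
        return "GAME"
    return None  # gap between the stated APPROACH and GAME ranges
```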

📋 CURRENT BOARD

HYPOTHESIS	SCORE	FLOOR	DIE	STATUS
GRID-SCALE	0.95	7	🪴	SOVEREIGN ✓
COLOR-MAP	0.90	7	🪴	SOVEREIGN ✓
SYMMETRY	0.85	7	🪴	SOVEREIGN ✓
CROP	0.80	6	🪴	IN GAME
FILL	0.75	6	🪴	IN GAME
REFLECTION	0.72	6	⏸	IN GAME — more examples needed
ROTATION	0.70	5	⏸	IN GAME — semantic missing
COMPOSE	0.67	5	🎲	WILD — multi-hypothesis synthesis
SEMANTIC-MATCH	0.20	3	🐍	BELOW DOOR
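The board can be checked against the 66% door with a few lines of code. In this sketch the rows are copied from the table above, with each status shortened to its leading label. `BOARD`, `DOOR`, and `below_door` are illustrative names, and a score of exactly 0.66 is assumed to fall below the door.

```python
DOOR = 0.66  # the 66% threshold

# (hypothesis, score, floor, status label) copied from the board.
BOARD = [
    ("GRID-SCALE", 0.95, 7, "SOVEREIGN"),
    ("COLOR-MAP", 0.90, 7, "SOVEREIGN"),
    ("SYMMETRY", 0.85, 7, "SOVEREIGN"),
    ("CROP", 0.80, 6, "IN GAME"),
    ("FILL", 0.75, 6, "IN GAME"),
    ("REFLECTION", 0.72, 6, "IN GAME"),
    ("ROTATION", 0.70, 5, "IN GAME"),
    ("COMPOSE", 0.67, 5, "WILD"),
    ("SEMANTIC-MATCH", 0.20, 3, "BELOW DOOR"),
]


def below_door(board):
    """Names of hypotheses whose score does not clear the door."""
    return [name for name, score, _floor, _status in board if score <= DOOR]


# The scores and the status column should agree about who is below the door.
flagged = [name for name, _s, _f, status in BOARD if status == "BELOW DOOR"]
print(below_door(BOARD) == flagged)
```

On the rows shown, only SEMANTIC-MATCH falls below the door, which matches its status label.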

🎲 LUFFY'S FIRST ROLL

Run forge-fold-spatial tonight. Does spatial fold alone get GRID-SCALE to 0.95+? If yes: ladder. Floor=7 confirmed.

"Each ARC task hands you an IQ. The examples are its RAYBFAG score. The new grid is the test. Below 66%: not ready. Above 66%: always an answer."

— ABR-008, 2026-04-05
