ARC-AGI-1 — Model Performance
| # | Model | Score | Actions | Replay | Published | Notes |
|---|---|---|---|---|---|---|
ARC-AGI-1 — Task Browser
Click any task ID to open on arcprize.org
| Task ID | Dataset | Human | Best AI | EOSE | Category (γ₁) |
|---|---|---|---|---|---|
18 Transformation Categories — V8 γ₁-distance taxonomy
SOVEREIGN zone = dist ≤ 0.5 · arc-worker-v8 handles 12 · +V4 = new in v4
All Models — Score Comparison
γ₁-Distance Spiral — Fleet Category Map
EOSE Fleet — arc-worker-v8 · ABR-008
5-Lane Architecture
Progress Toward Parity
V8 Engine — Snakes & Ladders Hypothesis Machine
🚪 THE 66% DOOR
Every ARC task is a hypothesis. The examples are the corpus. The new grid is the test. Below 66%: don't commit. Above 66%: always an answer — ladder or snake, never silence.
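The door rule can be sketched as a small gate function. This is only an illustration: the names `DOOR`, `passes_door`, and `decide` are hypothetical and not taken from arc-worker-v8, the score is assumed to be a float in [0, 1], and treating a score of exactly 0.66 as below the door is an assumption the text does not settle.

```python
DOOR = 0.66  # the 66% door; treating exactly 0.66 as "below" is an assumption


def passes_door(score: float) -> bool:
    """Return True when a hypothesis score is strictly above the door."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    return score > DOOR


def decide(score: float) -> str:
    """Above the door: commit to an answer. Otherwise: don't commit."""
    return "commit" if passes_door(score) else "hold"


if __name__ == "__main__":
    for s in (0.20, 0.66, 0.67, 0.95):
        print(s, decide(s))
```

Keeping the threshold in one constant means the gate and any reporting code cannot drift apart.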
🎲 THE SIX-SIDED DIE
Every hypothesis roll returns one of six faces. Two ladders. One stay. Two snakes. One wild.
| Face | Die | Move | Meaning |
|---|---|---|---|
| 1 | 🪴 | LADDER UP | Hypothesis confirmed by output |
| 2 | 🪴 | LADDER UP | Paired hypothesis resolves |
| 3 | ⏸ | STAY | Inconclusive; more examples needed |
| 4 | 🐍 | SNAKE DOWN | Output contradicts hypothesis |
| 5 | 🐍 | SNAKE DOWN | Supporting hypothesis fails |
| 6 | 🎲 | WILD | MEGSCIFIAR opens new structure; board grows |
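The six faces can be held in a plain lookup table. The sketch below is a hypothetical encoding, not the V8 engine's own data structure: `FACES`, `describe`, and `move_counts` are illustrative names. It records only the move and meaning stated for each face and does not assume how many floors a ladder or snake moves, since the text does not say.

```python
from collections import Counter

# face number -> (move, meaning), as listed for the six-sided die
FACES = {
    1: ("LADDER UP", "Hypothesis confirmed by output"),
    2: ("LADDER UP", "Paired hypothesis resolves"),
    3: ("STAY", "Inconclusive; more examples needed"),
    4: ("SNAKE DOWN", "Output contradicts hypothesis"),
    5: ("SNAKE DOWN", "Supporting hypothesis fails"),
    6: ("WILD", "MEGSCIFIAR opens new structure; board grows"),
}


def describe(face: int) -> str:
    """Return a one-line description of a die face (1 to 6)."""
    move, meaning = FACES[face]
    return f"Face {face}: {move} ({meaning})"


def move_counts() -> Counter:
    """Count how many faces map to each move."""
    return Counter(move for move, _ in FACES.values())


if __name__ == "__main__":
    for face in FACES:
        print(describe(face))
    print(dict(move_counts()))
```

The counts give a quick consistency check against "Two ladders. One stay. Two snakes. One wild."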
🏛 FLOOR 7 — THE FLOOR OF ALL FLOORS
Hypotheses other hypotheses depend on. Find these first. Every path to FINISH passes through floor=7.
Current floor=7: GRID-SCALE · COLOR-MAP · SYMMETRY
⚙ THE THREE PARTS — ARC-WORKER-V8 PIPELINE
PART 1 → PART 2 → PART 3

| Part | Component | Details | Question | Zone |
|---|---|---|---|---|
| PART 1: SPATIAL FOLD | forge-fold-spatial | steamdeck-field :9421 · gamma-1 distance | "Where are things?" | SOVEREIGN: dist < 0.35 |
| PART 2: SEMANTIC READ | forge-fold-logic | hypothesis field · color + rotation + crop | "What ARE they?" | APPROACH: 0.35-0.65 |
| PART 3: COMPOSE | arc-worker-v8 COMPOSE lane | multi-hypothesis synthesis · MEGSCIFIAR face 6 | "What does it mean together?" | GAME: > 0.66 |
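The three-part flow can be written down as ordered records. This is a descriptive sketch only: the `Stage` dataclass, `PIPELINE`, and `flow` are hypothetical names, and the zone labels are kept as verbatim strings rather than interpreted numerically, because the text does not say whether all three thresholds apply to the same quantity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    part: int
    name: str
    component: str
    question: str
    zone: str  # kept verbatim from the text; not interpreted numerically


PIPELINE = (
    Stage(1, "SPATIAL FOLD", "forge-fold-spatial",
          "Where are things?", "SOVEREIGN: dist < 0.35"),
    Stage(2, "SEMANTIC READ", "forge-fold-logic",
          "What ARE they?", "APPROACH: 0.35-0.65"),
    Stage(3, "COMPOSE", "arc-worker-v8 COMPOSE lane",
          "What does it mean together?", "GAME: > 0.66"),
)


def flow() -> str:
    """Render the stage order as a single arrow-joined line."""
    ordered = sorted(PIPELINE, key=lambda stage: stage.part)
    return " -> ".join(stage.name for stage in ordered)


if __name__ == "__main__":
    print(flow())
    for stage in PIPELINE:
        print(f"PART {stage.part}: {stage.component} asks {stage.question!r}")
```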
📋 CURRENT BOARD
| HYPOTHESIS | SCORE | FLOOR | DIE | STATUS |
|---|---|---|---|---|
| GRID-SCALE | 0.95 | 7 | 🪴 | SOVEREIGN ✓ |
| COLOR-MAP | 0.90 | 7 | 🪴 | SOVEREIGN ✓ |
| SYMMETRY | 0.85 | 7 | 🪴 | SOVEREIGN ✓ |
| CROP | 0.80 | 6 | 🪴 | IN GAME |
| FILL | 0.75 | 6 | 🪴 | IN GAME |
| REFLECTION | 0.72 | 6 | ⏸ | IN GAME — more examples needed |
| ROTATION | 0.70 | 5 | ⏸ | IN GAME — semantic missing |
| COMPOSE | 0.67 | 5 | 🎲 | WILD — multi-hypothesis synthesis |
| SEMANTIC-MATCH | 0.20 | 3 | 🐍 | BELOW DOOR |
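The board can be checked against the 66% door and floor 7 with a few lines of code. A minimal sketch, assuming the scores are fractions in [0, 1] and that "above" means strictly greater than 0.66; `BOARD`, `above_door`, and `floor_seven` are hypothetical names, and the rows are copied from the table.

```python
DOOR = 0.66  # the 66% door; strict comparison is an assumption

# (hypothesis, score, floor), copied from the current board
BOARD = [
    ("GRID-SCALE", 0.95, 7),
    ("COLOR-MAP", 0.90, 7),
    ("SYMMETRY", 0.85, 7),
    ("CROP", 0.80, 6),
    ("FILL", 0.75, 6),
    ("REFLECTION", 0.72, 6),
    ("ROTATION", 0.70, 5),
    ("COMPOSE", 0.67, 5),
    ("SEMANTIC-MATCH", 0.20, 3),
]


def above_door(board):
    """Names of hypotheses whose score is strictly above the door."""
    return [name for name, score, _ in board if score > DOOR]


def floor_seven(board):
    """Names of hypotheses on floor 7, the ones to find first."""
    return [name for name, _, floor in board if floor == 7]


if __name__ == "__main__":
    print("above door:", above_door(BOARD))
    print("floor 7:   ", floor_seven(BOARD))
```

On this board only SEMANTIC-MATCH falls below the door, and the floor-7 list matches the three hypotheses named as the current floor=7 set.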
🎲 LUFFY'S FIRST ROLL
Run forge-fold-spatial tonight. Does spatial fold alone get GRID-SCALE to 0.95+? If yes: ladder. Floor=7 confirmed.
"Each ARC task hands you an IQ. The examples are its RAYBFAG score. The new grid is the test. Below 66%: not ready. Above 66%: always an answer."
— ABR-008, 2026-04-05