TRB-LOCO-ARC-001 · LOCO Sovereign Test Framework + ARC-AGI-3

SECTION I.

WHAT IS LOCO

Language of Continuous Observation. The fleet's internal observation layer. Not filed in TRB until now because it grew from practice, not a single design moment. That is why it needs THIS TRB — to canonise what already exists.

LOCO is the fleet's eyes. Without LOCO, tests run blind. Without LOCO, the harness has no ground truth. LOCO observes every silo continuously — not on demand, not on cron — continuously. That is the distinction. That is why it is sovereign.

LOCO PAGES — CANONICAL LIST

The LOCO Fleet

loco-harness — The command deck. All silos. All observation channels. The master LOCO view.
loco-galaxy — Fleet-wide view. All 7 silos in a single canvas. The galaxy perspective.
loco-showcase — Silo showcase. Per-silo observation highlights. What LOCO catches.
loco-sovereign-showcase — Sovereign edition. The doctrine in action. γ₁-anchored.
loco-forge — Forge silo observation. Build signals, CI triggers, git events.
loco-msclo — MSCLO silo. Code intelligence observation layer.
loco-yone — yone silo. qdrant, PEMCLAU, vector space signals.
loco-pcdev — pcdev silo. Development signals, test results.
loco-steamdeck — steamdeck silo. Gaming + ARC harness observation.
loco-lounge — lounge silo. The ambient layer. Background signals.
loco-eose-dev — eose-dev silo. The edge silo. External signals ingest.

SECTION II.

WHY LOCO BELONGS IN TRB

Every test harness needs a canonical filing. LOCO observes the fleet continuously. It is the eyes. Without eyes, the test harness is blind.

THE CORE ARGUMENT

Eyes Before Tests

LOCO grew organically — from the forge page, to the silo pages, to the galaxy view. It was never designed in one sitting. It was assembled from observation practice. That is its strength. That is also why it was never TRB-filed.

The V12 test harness changes that. You cannot run a sovereign test suite without a continuous observation layer. Postman has Newman — a runner. Hazelcast has Management Center — a watcher. LOCO is our watcher. It needs to be in the canon before the harness goes live.

This TRB canonises LOCO as TRB filing #43 and names it as an integral component of the V12 sovereign test suite.

SECTION III.

ARC-AGI-3 AS SOVEREIGN TEST SUITE

25 interactive games. Each game tests a different cognitive substrate. Not pattern matching: genuine problem solving. RHAE (Relative Human Action Efficiency) is the metric: human baseline actions ÷ agent actions. 100% = human parity. >100% = more efficient than the human baseline.
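A minimal sketch of the metric, assuming RHAE is a single whole-game ratio in which fewer agent actions than the human baseline scores above 100%. The function name and the sample numbers are hypothetical. The harness may aggregate differently (per level, for example), so this does not attempt to reproduce any reported score.

```python
def rhae_percent(human_baseline_actions: int, agent_actions: int) -> float:
    """Relative Human Action Efficiency as a percentage.

    Assumed form: human baseline actions divided by agent actions,
    so that using fewer actions than the human scores above 100%.
    """
    if human_baseline_actions <= 0 or agent_actions <= 0:
        raise ValueError("action counts must be positive")
    return 100.0 * human_baseline_actions / agent_actions


# Hypothetical numbers for illustration only, not harness results.
print(rhae_percent(100, 100))  # 100.0 -> human parity
print(rhae_percent(120, 100))  # 120.0 -> more efficient than the human
print(rhae_percent(100, 125))  # 80.0  -> less efficient than the human
```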

GENUINE (1 game) — Pure state-reading · No vision required

FT09 = pure state-reading. Sprite attrs are human-named. The agent reads them directly. 85 actions, 208 human baseline, 115% RHAE. WIN. This is the proof of concept for the entire ARC chapter.

FT09 · WIN · 115% RHAE · 85 actions

WORMHOLE_NEEDED (6 games) — Step budget gated · Vision layer = unlock

These games require visual state reading. The step budget exhausts before the puzzle is solved because the agent clicks blind. The vision layer (camera.render() → minicpm-v → coordinate mapping) is the unlock: WATCHING → STRONG.

LP85 R11L TN36 VC33 S5I5 SU15

DEEP_SOLVE (18 games) — Vision + planning required · Next sprint target

Complex state spaces. Physics simulations. Multi-step planning. The vision layer is necessary but not sufficient. These require the full vision + reasoning + action loop.

SB26 CD82 TR87 TU93 G50T LS20 AR25 BP35 CN04 DC22 KA59 LF52 M0R0 RE86 SC25 SK48 SP80 WA30
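The three tiers can be held as a small registry and checked against the 25-game total. This is a sketch only: the dictionary name is hypothetical, and it assumes every game ID is a four-character code, as the IDs named elsewhere in this section suggest.

```python
# Hypothetical registry of the ARC-AGI-3 tiers described above.
# Assumption: every game ID is a four-character code.
GAMES = {
    "GENUINE": ["FT09"],
    "WORMHOLE_NEEDED": ["LP85", "R11L", "TN36", "VC33", "S5I5", "SU15"],
    "DEEP_SOLVE": [
        "SB26", "CD82", "TR87", "TU93", "G50T", "LS20",
        "AR25", "BP35", "CN04", "DC22", "KA59", "LF52",
        "M0R0", "RE86", "SC25", "SK48", "SP80", "WA30",
    ],
}

# Count games per tier and flatten the registry for a duplicate check.
counts = {tier: len(ids) for tier, ids in GAMES.items()}
all_ids = [game for ids in GAMES.values() for game in ids]

assert len(all_ids) == len(set(all_ids)), "a game is listed in two tiers"
print(counts)                # {'GENUINE': 1, 'WORMHOLE_NEEDED': 6, 'DEEP_SOLVE': 18}
print(sum(counts.values()))  # 25
```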

SECTION IV.

THE VISION WORMHOLE

camera.render() → minicpm-v → coordinate mapping. This is RH1 in the harness. The rendered pixel frame IS the ground truth.

RH1 — VISION LAYER SPECIFICATION

The Missing Wormhole

camera.render() — Pygame's internal camera produces an exact numpy array of the frame the human sees. Not a screenshot. Not a capture. The actual render buffer.

→ minicpm-v — Local vision model on msi01 Ollama. No cloud. No H100. Sovereign inference. Query: "What needs to move where to solve this puzzle? Give me x,y coordinates."

→ coordinate mapping — The model's answer is parsed. Coordinates are mapped to game canvas space. Optimal clicks are executed. Step budget is respected.
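The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the harness implementation. The helper names, the 512x512 frame size, the 64x64 canvas size, and the sample model answer are all hypothetical. The request body follows the shape of Ollama's /api/generate endpoint but is only built here, not sent. The frame is assumed to be already encoded as PNG bytes.

```python
import base64
import re

MODEL = "minicpm-v"  # model name taken from the text
PROMPT = "What needs to move where to solve this puzzle? Give me x,y coordinates."


def build_request(png_bytes: bytes) -> dict:
    """Build a JSON body for Ollama's /api/generate endpoint (not sent here)."""
    return {
        "model": MODEL,
        "prompt": PROMPT,
        "images": [base64.b64encode(png_bytes).decode("ascii")],
        "stream": False,
    }


# Matches "x, y" pairs with optional surrounding parentheses.
_PAIR = re.compile(r"\(?\s*(\d+)\s*,\s*(\d+)\s*\)?")


def parse_coordinates(answer: str) -> list:
    """Extract every integer (x, y) pair from the model's free-text answer."""
    return [(int(x), int(y)) for x, y in _PAIR.findall(answer)]


def map_to_canvas(point, frame_size, canvas_size):
    """Scale a pixel coordinate in the rendered frame to game canvas space."""
    fx, fy = point
    fw, fh = frame_size
    cw, ch = canvas_size
    # Integer scaling, clamped so a stray answer never leaves the canvas.
    cx = min(cw - 1, max(0, fx * cw // fw))
    cy = min(ch - 1, max(0, fy * ch // fh))
    return cx, cy


def plan_clicks(answer, frame_size, canvas_size, step_budget):
    """Turn a model answer into canvas clicks, never exceeding the step budget."""
    points = parse_coordinates(answer)[:step_budget]
    return [map_to_canvas(p, frame_size, canvas_size) for p in points]


# Hypothetical model answer and sizes, for illustration only.
answer = "Move the red tile from (128, 64) to (384, 448)."
print(plan_clicks(answer, (512, 512), (64, 64), step_budget=5))
# [(16, 8), (48, 56)]
```

Truncating the parsed points to the step budget before mapping keeps the budget check in one place, so a verbose model answer cannot spend more actions than the game allows.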

FT09 proved the general approach: reading internal state solved the game in 85 actions against a 208-action human baseline, about 59% fewer actions. The vision layer extends this from named attributes to pixel-level state. WATCHING → STRONG.

SECTION V.

INTEGRATION DOCTRINE

LOCO tests the fleet's internal pipeline. ARC tests the fleet's external cognitive capability. Together: the two-track harness. Internal truth + External frontier = sovereign self-knowledge.

THE TWO-TRACK MODEL

Internal · External · Sovereign

Track 1 — LOCO (Internal): Continuous observation. Every silo. Every signal. The fleet's heartbeat is visible. Drift is caught before it becomes failure. This is the γ₁→HA→DRG→CRUD→INTENT→SELF-REFLECT pipeline operating on the fleet itself.

Track 2 — ARC-AGI-3 (External): 25 games. Each a different cognitive challenge. No fleet data — pure reasoning under uncertainty. This tests the agent's frontier capability without any home-field advantage.

Convergence: LOCO uses the same vision doctrine as the ARC vision layer. camera.render() feeds LOCO's silo dashboards. The same feed, routed differently, solves ARC games. That is structural isomorphism in the test harness itself.

SECTION VI.

VERDICT

RATIFIED

TRB-LOCO-ARC-001 enters the canon.
LOCO is the 43rd TRB entry.
ARC-AGI-3 is Gisboon element #59 in the test harness.
The vision layer is the next sprint target.

CLO Bench: Harvey · Ruth · Cochran · Amani · Day 88 · 2026-05-02