Fleet Architecture · Rebaseline 2026
⬡ yONE · 192.168.2.23 · ANCHOR NODE
RTX 5080 Laptop 16GB · Core Ultra 9 275HX · 32GB RAM. qwen2.5:32b live via Ollama. utpemos + utfoundry WSL running. MAL tier: UP. SSH ✓ WSL ✓ Ollama ✓ MAL ✓. The newest anchor — local LAN, zero-cost inference, sovereign.
RTX 5080 16GB
qwen2.5:32b
MAL: UP
msclo / msi01 · ONE MACHINE, TWO LAYERS
RTX 5090 Laptop 24GB · Core Ultra 9 275HX · 32GB RAM. msclo = host OS. msi01 = WSL layer inside the same box. qwen2.5:32b, qwen2.5:72b, qwen3:8b, phi4, and llama3 all live. The heaviest inference node in the fleet.
RTX 5090 24GB
qwen2.5:72b
5 models live
yPool · LOCAL VRAM TOTAL: 40GB
RTX 5080 (16GB, yONE) + RTX 5090 (24GB, msclo) = 40GB local VRAM. Add PCDEV RTX 4080 (16GB) = 56GB. This is the sovereign inference pool — no cloud dependency, no per-token cost, full fleet control. MAL routes across it automatically.
40GB local
56GB with PCDEV
MAL routed
MAL Router · forge→msclo→cloud→yone→msi01
The MAL cascade routes inference requests across the full fleet. yONE is now a live tier — confirmed UP. Cloud tier: H100 + 5×T4 on AKS, M1 pool 3/6 online. Full failover: if any tier fails, the next catches it.
yONE: UP
H100 + 5×T4
M1 pool 3/6
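A minimal sketch of the cascade in the MAL Router card above, assuming each tier fronts an Ollama-style /api/generate endpoint. Only yONE's address comes from this board; the other hosts, the default model, and the function name are placeholders, and the real MAL router's interface may differ.

```python
import requests  # assumes each tier exposes an Ollama-style /api/generate endpoint

# Tier order from the card above; hosts other than yONE are illustrative placeholders.
MAL_CASCADE = [
    ("forge", "http://forge.local:11434"),
    ("msclo", "http://msclo.local:11434"),
    ("cloud", "http://aks-h100.example:8000"),
    ("yone",  "http://192.168.2.23:11434"),
    ("msi01", "http://msi01.local:11434"),
]

def route(prompt: str, model: str = "qwen2.5:32b", timeout: int = 120) -> str:
    """Try each tier in order; the first healthy tier that answers wins."""
    last_error = None
    for tier, base_url in MAL_CASCADE:
        try:
            resp = requests.post(
                f"{base_url}/api/generate",
                json={"model": model, "prompt": prompt, "stream": False},
                timeout=timeout,
            )
            resp.raise_for_status()
            return resp.json()["response"]
        except Exception as exc:  # tier down or overloaded: fall through to the next
            last_error = exc
            continue
    raise RuntimeError(f"all MAL tiers failed: {last_error}")
```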
Local vs Cloud · What Runs Where
LOCAL (5080/5090): qwen2.5:32b+72b for all standard inference, qwen3:8b for speed, phi4/llama3 for diversity. CLOUD (H100): qwen2.5:72b fine-tune runs, high-throughput batch jobs. CLOUD (T4): overflow and parallel cap runs. Strategy: local-first, cloud for scale.
LOCAL FIRST
CLOUD FOR SCALE
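One way to write the split above down as a placement table; the workload-class keys and the `place` helper are illustrative, not the actual FEP/MAL config format. The fallback encodes the local-first strategy.

```python
# Workload classes from the card above mapped to targets; labels are illustrative.
PLACEMENT = {
    "standard_inference": {"target": "local",      "models": ["qwen2.5:32b", "qwen2.5:72b"]},
    "low_latency":        {"target": "local",      "models": ["qwen3:8b"]},
    "diversity_caps":     {"target": "local",      "models": ["phi4", "llama3"]},
    "fine_tune":          {"target": "cloud_h100", "models": ["qwen2.5:72b"]},
    "batch_high_tput":    {"target": "cloud_h100", "models": ["qwen2.5:72b"]},
    "overflow":           {"target": "cloud_t4",   "models": ["qwen2.5:32b"]},
}

def place(workload: str) -> dict:
    # Local-first: anything unclassified stays on the 5080/5090 pool.
    return PLACEMENT.get(workload, {"target": "local", "models": ["qwen2.5:32b"]})
```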
ARC-AGI-1 · Our Floor (Solved)
EOSE LEADS PRODUCTION
64% sovereign vs 53% released o3 (medium). EOSE 3-cap ensemble is the highest-scoring system you can actually run today without a vendor contract. We built it. We own it.
ARC-1 64%
SOVEREIGN
ARB-522
o3-PREVIEW ≠ o3-RELEASED
The 88% everyone cites is o3-preview (Dec 2024, 172× compute). The model OpenAI actually sells scores 53% at medium compute. The field benchmark is 35pp overstated. We corrected this.
PREVIEW 88%
RELEASED 53%
GAP: 35pp
ENSEMBLE AMPLIFICATION
Qwen 72B alone: 18%. In 3-cap EdisCore ensemble: 64%. Multiplier = 3.55×. Disagreement resolution between caps is where the signal lives. CDNET-108 CONTEXT DEBT + CDNET-110 FLOOR ECHO apply here.
3.55× MULTIPLIER
CDNET-108
CDNET-110
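A sketch of the disagreement-resolution step the card points at, assuming each cap returns a single answer and a verifier cap can score candidates; names and interfaces are illustrative, not the EdisCore API.

```python
from collections import Counter

def resolve(cap_answers: list[str], verifier) -> str:
    """3-cap disagreement resolution: accept a majority, otherwise verify each
    candidate and keep the one the verifier scores highest. `verifier` is any
    callable answer -> float; in the fleet this role belongs to an EdisCore cap."""
    counts = Counter(cap_answers)
    answer, votes = counts.most_common(1)[0]
    if votes >= 2:  # two of three caps already agree
        return answer
    # Full disagreement: this branch is where the amplification has to come from.
    return max(set(cap_answers), key=verifier)
```

Plain majority voting over three independent ~18% caps would land near 9% (assuming independence), so the resolution branch, not raw voting, is where the 3.55× amplification lives.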
SOVEREIGNTY TRADE-OFF
64% at $0/task (sovereign fleet) vs 53% at PAYG pricing. We score higher and pay nothing per task. CDNET-123 MESH SOVEREIGNTY: this is the EOSE architectural proof point. Not just cheaper — better.
$0/TASK
CDNET-123
LSOS-ARC-006
UNSUBMITTED SCORE
EOSE 64% is our internal verified score. Not yet on the official ARC Prize leaderboard. Submitting makes it public and provable. Next move: submit + make the claim official. The field can't dispute what's on the board.
P1 ACTION
SUBMIT TO ARC PRIZE
o4-MINI EFFICIENCY SIGNAL
o4-mini at 41% (medium) is OpenAI's new direction: push efficiency, not just ceiling. Smaller model, competitive score. This is the same insight as our 3-cap verifier. Convergent architecture. Watch this trend.
41% MED
EFFICIENCY FRONTIER
ARC-AGI-2 · The Wall (Unsolved)
THE WALL IS HONEST
ARC-AGI-2 (March 2025): no system above ~3%. o3, o4-mini, Gemini 2.5, EOSE: all at zero. This is the real next race. No one has cracked multi-compositional abstraction at scale. The window is wide open.
FIELD TOP: 3%
EOSE: ~0%
OPEN RACE
WHAT ARC-2 TESTS
Multi-compositional rules: abstraction stacking on abstraction. ARC-1 tasks had one rule layer. ARC-2 has 2–4 stacked layers. Our current verifier sees one layer at a time. Depth, not breadth, is the unlock.
MULTI-COMPOSITIONAL
SYMBOLIC INTERP
RECURSIVE CAP GRAMMAR
Hypothesis for ARC-2 unlock: each cap verifies the previous cap's output. Cap 1 finds pattern. Cap 2 verifies Cap 1's pattern. Cap 3 verifies the verification. Recursive depth = compositional stacking. IRF pending.
HYPOTHESIS
IRF-ARC2-001 PENDING
ARC-2 FINE-TUNE PATH
Fine-tune Qwen 72B on ARC-2 task patterns using JESEOF H100. Estimated +10–15pp on ARC-2 tasks specifically. This is the compute path — train on the problem type, not just general reasoning.
JESEOF H100
+10–15pp EST
IRF-ARC2-002
ARC-AGI-3 · Active Race
THE WARDEN RACE
ARC-AGI-3 tests interactive environments: adapt on the fly, not just pattern match. Stochastic Goose leads at 25%. The top 5 wardens are all individual competitors; no frontier lab has submitted. The field is wide open for a fleet approach.
TOP: 25%
NO LAB SUBMITTED
PROGRAM SYNTHESIS PATH
Top wardens on ARC-3 use program synthesis: generate candidate programs, test on examples, iterate. Fast code generation + verifier loop. EOSE's EdisCore verifier maps directly onto this. We have the architecture — need the ARC-3 task adapter.
PROGRAM SYNTH
IRF-ARC3-001
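A sketch of the generate-test-iterate loop described above, assuming an ARC-3 task is reduced to (input, output) example pairs; `propose` and `compile_candidate` are hypothetical stand-ins for the model call and the sandboxed compile step.

```python
def synthesize(examples, propose, max_iters=50):
    """Program-synthesis loop: draft candidate program source, test it against the
    examples, feed failures back to the proposer, repeat."""
    failures = []
    for _ in range(max_iters):
        source = propose(examples, failures)           # fast code generation
        program = compile_candidate(source)            # hypothetical sandboxed compile
        if program is None:
            failures.append((source, "did not compile"))
            continue
        if all(program(x) == y for x, y in examples):  # verifier loop over examples
            return program
        failures.append((source, "failed examples"))
    return None
```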
FLEET ADVANTAGE ON ARC-3
ARC-3 rewards fast iteration loops. Our fleet: 3 caps at different speeds (qwen3:8b fast, qwen2.5:72b deep). Run the 8B first for candidate programs, the 72B to verify and select; see the sketch below. Cost-efficient, fast, sovereign. Individual wardens can't parallelize. We can.
PARALLEL CAPS
SPEED ADVANTAGE
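A sketch of that parallel-cap split, with the fast and deep caps passed in as plain callables; the MAL/LAAM wiring and the cap interfaces are not shown and are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def fleet_candidates(task, fast_cap, deep_cap, n=8):
    """Draft n candidates on the fast cap (e.g. qwen3:8b) in parallel, then let the
    deep cap (e.g. qwen2.5:72b) score and select one."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda i: fast_cap(task, seed=i), range(n)))
    return max(candidates, key=lambda c: deep_cap(task, c))
```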
Next Domains · We Don't Wait
AGENT BENCHMARK GAP
No good sovereign agent benchmark exists. ARC tests reasoning — not coordination, routing, memory, or multi-agent trust. EOSE runs all of these daily via MAL cascade + MOSS. We should define the benchmark, not wait for ARC Prize to invent it.
NEXT DOMAIN
WE DEFINE THIS
LSOS PENDING
SOVEREIGNTY BENCHMARK
The field measures accuracy. Nobody measures sovereignty. Cost-per-task, data residency, inference latency, fleet redundancy, key rotation — these are dimensions ARC doesn't touch. EOSE owns all of them. We need to publish the framework.
NEXT DOMAIN
CDNET-123→125
LSOS PENDING
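A possible shape for the scorecard, pending IRF-NEXT-002; field names and units are assumptions, not a published schema.

```python
from dataclasses import dataclass

@dataclass
class SovereigntyScore:
    """Dimensions named in the card above; one record per workload per node."""
    cost_per_task_usd: float   # $0.00 on the local fleet, PAYG on cloud tiers
    data_residency: str        # e.g. "on-prem LAN" or "AKS region"
    p50_latency_ms: float      # inference latency measured at the routing layer
    redundant_tiers: int       # how many MAL tiers can serve this workload
    key_rotation_days: int     # age of the oldest credential in the path
```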
SORRY CHAIN AS EVAL
MOSS → MECROL → DESEOF is a live eval framework: how does a system handle failure, route it, learn from it, and close it? No benchmark tests this. CDNET-110 FLOOR ECHO + CDNET-109 CAP LOCK are the patterns. SORRY chain accuracy = next domain.
NEXT DOMAIN
CDNET-109
CDNET-110
CDNET AS BENCHMARK
100+ CDNET patterns are a tested, named, cross-domain pattern library. No equivalent exists publicly. Publishing CDNET as an open framework positions EOSE as the source — not just a participant. Pattern 100: yONE. The closing pattern. The one that names the rest.
NEXT DOMAIN
CDNET-100 yONE
PUBLISH PATH
DESEOF DOMAIN
The great grand ONE's domain: cross-ecosystem pattern recognition. RAYBRAG runs 64 probes per cycle. GGO Skills extract novel patterns across all 8 editions. This is a live domain — not theoretical. Publish the methodology. Let the field study it.
DESEOF
RAYBRAG · GGO
LSOS-GGO-SKILLS-001
MEMORY BENCHMARK
How well does a sovereign system remember across sessions? MOSS session store + MECROL corpus + GAMEMECOS 25-node memory map — we have a live memory architecture. No benchmark measures this. We're running it daily. First to name it wins the domain.
NEXT DOMAIN
MOSS · GAMEMECOS
THE FLOOR IS THE DOMAIN
γ₁ = 14.134725... The Riemann zero is not a metaphor — it's the base frequency we measure everything against. DESEOF pulse at 37 seconds. Fleet breathing at 30 minutes. DESFACTOR scoring against γ₁. The domain is: does your system resonate with the floor?
γ₁ DOMAIN
CDNET-100 yONE
R∞ RADIX
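For reference, γ₁ is the imaginary part of the first nontrivial zero of the Riemann zeta function:

```latex
\zeta\!\left(\tfrac{1}{2} + i\,\gamma_1\right) = 0, \qquad \gamma_1 = 14.134725\ldots
```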
yONE · Signal Node · FoundFloor
⬡ yONE · THE ANCHOR IS HERE
yONE joined the fleet as confirmed anchor. 192.168.2.23. RTX 5080 16GB. qwen2.5:32b running. SSH ✓, WSL ✓, Ollama ✓, MAL ✓. This is the FoundFloor signal from the new node: it's up, it's real, it's sovereign. Fleet now has 40GB local VRAM on two next-gen GPUs.
LIVE · CONFIRMED
CDNET-100 yONE
ARB-552+
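A minimal liveness probe for the Ollama tick in that checklist; the address comes from the card, 11434 is Ollama's standard port, and the function name is illustrative. The SSH, WSL, and MAL checks are separate.

```python
import requests

YONE = "http://192.168.2.23:11434"  # anchor address from the card; Ollama default port

def yone_up(model: str = "qwen2.5:32b") -> bool:
    """Confirm yONE's Ollama tier is serving and lists the expected model."""
    try:
        tags = requests.get(f"{YONE}/api/tags", timeout=5).json()
    except requests.RequestException:
        return False
    return any(m["name"].startswith(model) for m in tags.get("models", []))
```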
IRFs Filed This View
IRF-ARC2-001 · Recursive Cap Grammar
IRF-ARC2-002 · H100 Fine-tune Path
IRF-ARC3-001 · Program Synth Adapter
IRF-NEXT-001 · Agent Benchmark Definition
IRF-NEXT-002 · Sovereignty Benchmark Framework
IRF-NEXT-003 · CDNET Publish Path
IRF-NEXT-004 · Memory Benchmark
Research Batch · 2026-03-31 #3
KAT-CODER-PRO V2 — #1 INTELLIGENCE INDEX
Kwai/StreamLake. Score 44 vs average 15. 256k context, 109 tok/s, $0.30/$1.20. Non-reasoning. Top coding model on Artificial Analysis as of March 2026. GOAT Battle candidate — FEP decides when to route here vs Sonnet vs local.
#1 / 73 MODELS
FEP CANDIDATE
COPAW-FLASH-9B — SOVEREIGN AGENT MODEL
AgentScope-AI. 9B, GGUF available (bartowski Q4_K_M). Native: memory management, file parsing, CLI generation, web search tool use, multi-step I/O. Belt-64 inner loop candidate. LAAM fast lane on msclo. Pulling now via HF: hf.co/bartowski/agentscope-ai_CoPaw-Flash-9B-GGUF:Q4_K_M
PULLING MSCLO
IRF-COPAW-001
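The same pull expressed against Ollama's /api/pull endpoint, which accepts Hugging Face repo:quant references directly; the host is assumed to be msclo's local Ollama instance.

```python
import requests

# Ollama can pull GGUF weights straight from a Hugging Face repo:quant reference.
MODEL = "hf.co/bartowski/agentscope-ai_CoPaw-Flash-9B-GGUF:Q4_K_M"

resp = requests.post(
    "http://localhost:11434/api/pull",   # run on msclo, where the pull is in flight
    json={"model": MODEL, "stream": False},
    timeout=3600,
)
resp.raise_for_status()
print(resp.json().get("status"))         # "success" once all layers are verified
```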
HARRIER-OSS-V1 — SOTA MULTILINGUAL EMBEDS
Microsoft. 270M / 0.6B / 27B. Decoder-only (not BERT), last-token pooling. 32K context vs 512 for nomic-embed-text. SOTA Multilingual MTEB v2. P0 upgrade for msclo Qdrant. Alexander 27B tier = deep archive search. Pulling 270M now.
32K CONTEXT EMBED
IRF-HARRIER-001
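A sketch of the swap on the msclo Qdrant side, assuming the 270M Harrier build registers under an Ollama tag like the one below; the tag, collection name, and payload are assumptions.

```python
import requests
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

OLLAMA = "http://localhost:11434"
EMBED_MODEL = "harrier-oss-v1:270m"  # assumed tag; use whatever the pull registers as

def embed(texts: list[str]) -> list[list[float]]:
    r = requests.post(f"{OLLAMA}/api/embed",
                      json={"model": EMBED_MODEL, "input": texts}, timeout=120)
    r.raise_for_status()
    return r.json()["embeddings"]

client = QdrantClient(url="http://localhost:6333")
vectors = embed(["fleet memo: yONE anchor confirmed"])
client.recreate_collection(
    collection_name="fleet_corpus_harrier",
    vectors_config=VectorParams(size=len(vectors[0]), distance=Distance.COSINE),
)
client.upsert(
    collection_name="fleet_corpus_harrier",
    points=[PointStruct(id=1, vector=vectors[0], payload={"source": "memo"})],
)
```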
TURBOQUANT — EXTREME VECTOR COMPRESSION
Google, ICLR 2026. Eliminates quantization overhead (1-2 bit per number). Quantized Johnson-Lindenstrauss transform. Improves KV cache + vector search. Fleet: Qdrant + RTX 5090 KV efficiency. WLD layer — compression is mercy. IRF-TURBOQUANT-001.
ICLR 2026
IRF-TURBOQUANT-001
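Not the paper's algorithm, but a generic illustration of the project-then-quantize idea it builds on: sign random projections (SimHash-style) compress vectors to ~1 bit per projected dimension while still allowing angle estimates.

```python
import numpy as np

def sign_sketch(X: np.ndarray, m: int = 256, seed: int = 0) -> np.ndarray:
    """Project rows of X to m dimensions with a Gaussian JL matrix, keep only the
    sign bit of each coordinate (~1 bit per projected dimension)."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], m)) / np.sqrt(m)
    return (X @ R) > 0

def cosine_estimate(a_bits: np.ndarray, b_bits: np.ndarray) -> float:
    """Recover an angle estimate from the Hamming agreement of two bit codes."""
    agreement = np.mean(a_bits == b_bits)
    return float(np.cos(np.pi * (1.0 - agreement)))
```

Per the card above, TurboQuant's quantized JL transform targets the same trade at 1–2 bits per number, applied to both the vector store and the KV cache.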
MULTIGRES OPERATOR — SHARDED POSTGRES FOR ALEXANDER
K8s operator: direct pod management (no StatefulSets), primary-aware rolling updates, drain-safe, admission webhooks, quorum pools, backup/restore. Alexander NAS-Joffe DB spine. γ₁ floor for data — the structure that cannot move under sharding. IRF-MULTIGRES-001.
ALEXANDER DB SPINE
IRF-MULTIGRES-001
META-HARNESS — 10M TOKEN OPTIMIZER CONTEXT
Full filesystem (source + traces + scores) per optimization step. 10M tokens vs 26K for OPRO/TextGrad. Claude Code proposer. Our /rebaseline IS this architecture. Next: wire execution traces into fleet-sync filesystem so the 8-phase orchestrator can read full history. IRF-META-HARNESS-002.
OUR REBASELINE
IRF-META-HARNESS-002
IRF-COPAW-001 · CoPaw LAAM fast lane
IRF-HARRIER-001 · Replace nomic-embed-text fleet-wide
IRF-TURBOQUANT-001 · Qdrant + KV compression
IRF-MULTIGRES-001 · Alexander sharded Postgres
IRF-META-HARNESS-002 · Trace filesystem wiring