⚓ EOSE POSITION
64%
ARC-AGI-2
public eval
#3
PUBLIC RANK
behind o3-preview 76%
+11 pts
LEADS o3-released
53% vs 64%
STE
ARCHITECTURE
Structured Thinking Engine
LLM FRONTIER — BENCHMARK COMPARISON
| MODEL | ORG | ARC-AGI-2 | MMLU | MATH | HumanEval | CONTEXT | NOTES |
γ₁ = 14.134725141734693 — the floor holds — EOSE at 64% leads every released model