COST OVERVIEW
API + CLOUD SPEND · EOSE FLEET · MAY 2026
⚠ ANTHROPIC BURN ACCELERATING
Apr 28: $282.50 → Apr 29: $294.16 → Apr 30: $305.17 → May 2: $351.99
Trend: +$23/day avg across invoiced days (May 1 has no invoice data). At the May 2 rate: $10,560/month on Anthropic alone.
Auto-reload fired 7x on May 2 (up from 2x on Apr 29 and 3x on Apr 30); each small $11-12 top-up = ~40K tokens at Sonnet 4.6 pricing.
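The trend and run-rate figures above can be reproduced from the four invoiced daily totals. This is a minimal sketch, not the dashboard's actual code; the function names are hypothetical, and it assumes the "+$23/day" figure averages over invoiced days rather than calendar days.

```python
from datetime import date

# Daily Anthropic totals from the invoice history (May 1 has no invoice data).
daily_totals = {
    date(2026, 4, 28): 282.50,
    date(2026, 4, 29): 294.16,
    date(2026, 4, 30): 305.17,
    date(2026, 5, 2): 351.99,
}

def trend_per_invoiced_day(totals):
    """Average increase between consecutive invoiced days."""
    values = [totals[d] for d in sorted(totals)]
    return (values[-1] - values[0]) / (len(values) - 1)

def trend_per_calendar_day(totals):
    """Average increase per calendar day between first and last invoice."""
    first, last = min(totals), max(totals)
    return (totals[last] - totals[first]) / (last - first).days

def monthly_run_rate(totals, days=30):
    """Project a month at the latest day's spend, rounded to whole dollars per day."""
    return round(totals[max(totals)]) * days

print(f"per invoiced day: +${trend_per_invoiced_day(daily_totals):.2f}")
print(f"per calendar day: +${trend_per_calendar_day(daily_totals):.2f}")
print(f"monthly run rate: ${monthly_run_rate(daily_totals):,}")
```

Averaging over calendar days instead gives a smaller slope, because the May 1 gap stretches the same increase over four days.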
ANTHROPIC TODAY
$351.99
May 2 · 7 auto-reloads
⇧ +15.3% vs Apr 30
AZURE / AKS (APR)
$7,332
Day 19 tracking · ~$11.6K/mo
⇧ vs $10,552 March
TOTAL API (4-DAY)
$1,234
Apr 28–May 2 (4 invoiced days, May 1 missing) · Anthropic only
⇧ +$308/day avg
LOCAL FLEET SAVE
~$180/day
Qwen default · vs all-Claude
⇣ Routing working
ANTHROPIC INVOICE HISTORY (VISIBLE)
| DATE | TYPE | AMOUNT | DAILY TOTAL | STATUS |
|---|---|---|---|---|
| 2026-05-02 | Auto-reload (x7) | $351.99 | $351.99 | HIGHEST DAY |
| 2026-04-30 | Auto-reload (x3) | $305.17 | $305.17 | WATCH |
| 2026-04-29 | Auto-reload (x2) | $294.16 | $294.16 | WATCH |
| 2026-04-28 | Single load | $282.50 | $282.50 | BASELINE |
MONTHLY PROJECTION
AT CURRENT RATE
$10,560
$352/day x 30 · Anthropic
AZURE AKS (PROJ)
$11,600
Apr actuals · May tracking
COMBINED MAY PROJ
$22,160
Anthropic + Azure · excl GCP/AWS
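The projection cards above are simple run-rate arithmetic. The sketch below assumes a 30-day month and that the Azure figure is extrapolated from the day-19 month-to-date total; `run_rate` is a hypothetical helper name. The provider panel lists Azure amounts in CA$, so the plain sum mirrors the dashboard rather than converting currency.

```python
def run_rate(month_to_date, days_elapsed, days_in_month=30):
    """Extrapolate a month-to-date total to a full month."""
    return month_to_date / days_elapsed * days_in_month

anthropic_monthly = 352 * 30                 # $352/day x 30
azure_monthly = run_rate(7332, 19)           # April spend at day 19
azure_rounded = round(azure_monthly, -2)     # dashboard shows ~$11.6K
combined = anthropic_monthly + azure_rounded

print(f"Anthropic: ${anthropic_monthly:,}")
print(f"Azure:     ${azure_monthly:,.0f} (shown as ${azure_rounded:,.0f})")
print(f"Combined:  ${combined:,.0f}")
```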
▶ LOCAL FLEET ROUTING (V12)
Default model: mal/qwen2.5:32b (local forge) → Anthropic only on fallback/reasoning tasks.
Compaction: safeguard mode active. Session context trimmed before token overflow.
Estimated local save: ~$5,400/month vs all-Claude routing.
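The routing and compaction rules above might look roughly like this. It is a sketch under stated assumptions: `Task`, `needs_reasoning`, `local_available`, and the whitespace token counter are hypothetical stand-ins, and the real V12 router and safeguard mode may work differently.

```python
from dataclasses import dataclass

LOCAL_MODEL = "mal/qwen2.5:32b"        # default model from the routing note
FALLBACK_MODEL = "claude-sonnet-4-6"   # Anthropic model named in the provider panel

@dataclass
class Task:
    prompt: str
    needs_reasoning: bool = False  # hypothetical flag set by the caller

def pick_model(task, local_available=True):
    """Local-first: use Anthropic only for reasoning tasks or when local is down."""
    if task.needs_reasoning or not local_available:
        return FALLBACK_MODEL
    return LOCAL_MODEL

def compact(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    """Drop the oldest messages until the context fits the token budget.

    The latest message is always kept, even if it alone exceeds the budget.
    """
    kept = list(messages)
    while len(kept) > 1 and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)
    return kept

print(pick_model(Task("summarise this log")))
print(pick_model(Task("plan the migration", needs_reasoning=True)))
print(compact(["a b c", "d e", "f"], max_tokens=3))
```

Trimming before the request is sent keeps an oversized session from overflowing the context window.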
DAILY BURN RATE
ANTHROPIC · DAY-BY-DAY · AUTO-RELOAD ANALYSIS
DAILY BREAKDOWN
AUTO-RELOAD ANATOMY
Each $282.50 = large reload (threshold hit, auto-reloads to $300).
Each $11–12 = small top-up (balance dropped to $5, reloaded to $15).
May 2 had 6 small reloads + 1 large = 7 reloads in one day, with the balance hitting the $5 floor 6 times.
That's ~6 x $10 burn sessions = heavy reasoning task day (subagents, GOAT pages, CRM build).
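The small top-up behaviour described above (balance drops to $5, reloads to $15) can be simulated to see how usage turns into reload counts. This sketch models credit only, not the invoiced $11–12 amounts; the function name and the even $5 spend events are illustrative assumptions.

```python
def count_small_reloads(spend_events, start_balance=15.0, floor=5.0, reload_to=15.0):
    """Count top-ups triggered when the balance falls to or below the floor."""
    credit = reload_to - floor  # each top-up adds $10 of credit
    balance = start_balance
    reloads = 0
    for cost in spend_events:
        balance -= cost
        while balance <= floor:  # a large spend can trigger several top-ups
            balance += credit
            reloads += 1
    return reloads

# About $60 of usage in $5 steps triggers six top-ups,
# the small-reload count listed for May 2 in the daily table.
print(count_small_reloads([5.0] * 12))
```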
| DATE | LARGE RELOADS | SMALL RELOADS | EST TOKENS | DAILY $ | DELTA |
|---|---|---|---|---|---|
| Apr 28 | 1 | 0 | ~950K | $282.50 | baseline |
| Apr 29 | 1 | 1 | ~980K | $294.16 | +$11.66 |
| Apr 30 | 1 | 2 | ~1.02M | $305.17 | +$11.01 |
| May 1 | — | — | — | No invoice data (possibly lower-activity day) | — |
| May 2 | 1 | 6 | ~1.17M | $351.99 | +$46.82 |
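As a consistency check on the table, each daily total can be split into one $282.50 large reload plus its small reloads. This is a sketch with a hypothetical helper name, and it assumes every invoiced day includes exactly one large reload, as the table lists.

```python
LARGE_RELOAD = 282.50

# (daily total, small reload count) taken from the table rows.
rows = {
    "Apr 28": (282.50, 0),
    "Apr 29": (294.16, 1),
    "Apr 30": (305.17, 2),
    "May 2": (351.99, 6),
}

def avg_small_reload(daily_total, small_count, large=LARGE_RELOAD):
    """Average size of the small reloads on a given day, or None if there were none."""
    if small_count == 0:
        return None
    return (daily_total - large) / small_count

for day, (total, count) in rows.items():
    avg = avg_small_reload(total, count)
    label = "no small reloads" if avg is None else f"${avg:.2f} per small reload"
    print(f"{day}: {label}")
```

Every day with small reloads averages between $11 and $12, which matches the top-up size quoted in the anatomy notes.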
WHAT DROVE MAY 2 SPIKE
High-cost operations on May 2:
· CRM build (Bonsai Helix Coaster V12) — subagent spawn, multiple large writes
· 4 GOAT AP pages deployed — full story prose generation
· CLO CRM completion — 82KB single-file HTML generation
· Repo-CLO harness runs (x2) — GitHub API + galaxy rebuild
· This cost CRM itself — current session
These are all one-time build tasks. The daily operational rate should return to the $280–300 range once the build sprint ends.
PROVIDER BREAKDOWN
ALL API PROVIDERS · EOSE FLEET · COST + ROUTING
ANTHROPIC
$308/day
Claude Sonnet 4.6 · auto-reload active
Key: sk-ant-api03-XppeLO...
Model: claude-sonnet-4-6
Route: fallback only (primary = local Qwen)
Pricing: ~$3/$15 per M in/out tokens
Monthly proj: $9,240–$10,560
72% of total API spend
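The Anthropic pricing line above (~$3/$15 per M in/out tokens) converts a token mix to dollars as sketched below. The rates are the approximate ones quoted in the panel; the function name and the example token counts are illustrative, not measured usage.

```python
INPUT_USD_PER_M = 3.00    # approximate rate quoted in the panel
OUTPUT_USD_PER_M = 15.00  # approximate rate quoted in the panel

def cost_usd(input_tokens, output_tokens):
    """Cost of one request (or one day) for a given input/output token mix."""
    return (input_tokens / 1_000_000 * INPUT_USD_PER_M
            + output_tokens / 1_000_000 * OUTPUT_USD_PER_M)

# Illustrative mix: 1M input tokens and 200K output tokens.
print(f"${cost_usd(1_000_000, 200_000):.2f}")
```

Output tokens cost five times as much as input tokens at these rates, so generation-heavy tasks dominate the bill.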
AZURE / AKS
$387/day
AKS aks-eose-aaas-dev · Canada East
Cluster: aks-eose-aaas-dev
RG: rg-eose-aks-dev
Apr cost: CA$7,332 (day 19)
GPU pools: H100 / T4 / Adelic (all 0 count ✅)
Monthly proj: CA$11,600
28% of total API spend
GCP (ZERO-DR / KRSRHONE)
~$45/day
northamerica-northeast1 · T4/A100
ARC runners, GPU workloads
AWS (CATHEDRAL / JAYRHONE)
~$38/day
ca-central-1 / us-east-2 · V100/A10G
ARC runners, batch inference
LOCAL FLEET (FREE)
$0/day
forge/msclo/yone/pcdev · owned hardware
Primary inference path · saves ~$180/day
LOCAL MODELS ACTIVE (NO COST)
| SILO | GPU | MODELS | DAILY EQUIV SAVE | STATUS |
|---|---|---|---|---|
| forge (192.168.2.12) | RTX 4090 24GB | qwen3:14b/8b, qwen2.5:32b, deepseek-r1:32b, qwq:32b | ~$80/day | LIVE |
| msclo (192.168.2.19) | RTX 5090 24GB | qwen2.5:32b, nomic-embed | ~$55/day | LIVE |
| pcdev (192.168.2.16) | RTX 3090 24GB | qwen2.5:72b/32b/14b, llama3.1:8b | ~$45/day | LIVE |
| yone | RTX 5080 16GB | qwen3:14b/8b, deepseek-r1:7b, nomic-embed | ~$30/day | LIVE |
COST FORECAST ENGINE
MONTHLY PROJECTION · SCENARIO MODELLING · REDUCTION TARGETS
RUN FORECAST
Daily Anthropic $
Daily Azure $
Daily GCP $
Daily AWS $
Months ahead
Reduction target %
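The forecast inputs listed above suggest a calculation along these lines. The dashboard's actual formula is not shown, so this is an assumed sketch: a flat daily rate per provider, a 30-day month, and a reduction target applied as a straight percentage. The `forecast` function and its result keys are hypothetical.

```python
def forecast(daily_costs, months_ahead=1, reduction_pct=0.0, days_per_month=30):
    """Project spend from flat daily rates and apply a reduction target."""
    monthly = sum(daily_costs.values()) * days_per_month
    target_monthly = monthly * (1 - reduction_pct / 100)
    return {
        "monthly": monthly,
        "total": monthly * months_ahead,
        "target_monthly": target_monthly,
        "target_total": target_monthly * months_ahead,
    }

# Daily rates taken from the provider breakdown panels.
result = forecast(
    {"anthropic": 308, "azure": 387, "gcp": 45, "aws": 38},
    months_ahead=3,
    reduction_pct=30,
)
for key, value in result.items():
    print(f"{key}: ${value:,.0f}")
```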
REDUCTION LEVERS
| LEVER | MONTHLY SAVE | EFFORT | STATUS |
|---|---|---|---|
| Local-first routing (already done) | ~$5,400 | ACTIVE | DONE |
| Compaction safeguard (already done) | ~$800 | ACTIVE | DONE |
| Qwen3.6 native Windows (pcdev/forge) | ~$1,200 | LOW | TEST |
| Anthropic batch API for non-urgent tasks | ~$2,100 | MEDIUM | PLANNED |
| AKS scale-to-zero on unused namespaces | ~$1,800 | MEDIUM | PLANNED |
| Caching layer for repeated PEMCLAU queries | ~$900 | MEDIUM | PLANNED |
V12 DEBT ENGINE
API SPEND AS SOVEREIGN DEBT · γ₁ = 14.134725141734693 · REDUCTION PROTOCOL
V12 DEBT DOCTRINE
API spend is sovereign debt. Every dollar spent on external inference is a dollar that could fund local compute. The floor is γ₁ = 14.134725141734693 — the cost anchor. All reduction targets are measured relative to the baseline day (Apr 28: $282.50).
DEBT FLOOR (γ₁ ANCHOR)
$282.50
Apr 28 baseline · minimum observed
CURRENT DEBT RATE
$351.99/day
May 2 · 24.6% above floor
TARGET (−30%)
$197.75/day
Local routing + batch + cache
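The three cards above follow from the Apr 28 baseline. A short sketch of the arithmetic, with hypothetical helper names:

```python
FLOOR = 282.50     # Apr 28 baseline
CURRENT = 351.99   # May 2 daily total

def pct_above_floor(current, floor=FLOOR):
    """How far the current daily rate sits above the baseline, in percent."""
    return (current / floor - 1) * 100

def reduction_target(floor=FLOOR, reduction_pct=30):
    """Daily target expressed as a percentage cut from the baseline."""
    return floor * (1 - reduction_pct / 100)

target = reduction_target()
print(f"above floor: {pct_above_floor(CURRENT):.1f}%")
print(f"target:      ${target:.2f}/day")
print(f"gap to close: ${CURRENT - target:.2f}/day")
```

The target is measured from the baseline day, not from the May 2 rate, so the gap to close from the current rate is larger than 30%.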
DEBT REDUCTION PROTOCOL
| # | ACTION | CURRENT | TARGET | SAVE/DAY | STATUS |
|---|---|---|---|---|---|
| 1 | Local-first routing (Qwen default) | $490/day equiv | $308/day | $182 | DONE |
| 2 | Compaction safeguard | +$30/day equiv | trimmed | $30 | DONE |
| 3 | Qwen3.6 vLLM native (pcdev+forge) | Ollama 64 tok/s | vLLM 80+ tok/s | $40 | TESTING |
| 4 | Anthropic batch API (non-urgent) | $308/day | $207/day | $101 | PLANNED |
| 5 | PEMCLAU query caching (Redis) | repeated queries | cache hit 40% | $30 | PLANNED |
| 6 | AKS idle namespace scale-to-zero | $387/day | $327/day | $60 | PLANNED |
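Action 5 (query caching) could be prototyped as below before committing to Redis. This is an in-memory stand-in, not a Redis client; the class, its normalisation rule (trim and lowercase), and the TTL default are all assumptions made for illustration.

```python
import hashlib
import time

class QueryCache:
    """Tiny TTL cache keyed by a hash of the normalised query text."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._store = {}
        self._ttl = ttl_seconds
        self._clock = clock
        self.hits = 0
        self.misses = 0

    def _key(self, query):
        normalised = query.strip().lower()
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def get_or_compute(self, query, compute):
        """Return a fresh cached answer, or call compute(query) and store it."""
        key = self._key(query)
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = compute(query)
        self._store[key] = (now, value)
        return value

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

cache = QueryCache()
paid_calls = []

def expensive_lookup(query):
    paid_calls.append(query)  # stands in for a paid API call
    return query.strip().upper()

for q in ["a", "b", "a", "c", "A "]:
    cache.get_or_compute(q, expensive_lookup)
print(f"hit rate: {cache.hit_rate:.0%}, paid calls: {len(paid_calls)}")
```

Only cache misses reach the paid backend, so the hit rate translates directly into avoided calls.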
MONTHLY DEBT TRAJECTORY
WITHOUT REDUCTIONS
$21,897/mo
Anthropic $10,560 + Azure $11,337
WITH ALL LEVERS (TARGET)
$12,840/mo
-$9,057/mo · -41% · local fleet carrying load
COST HELIX
SPEND FLOW · γ₁ = 14.134725141734693 · PROVIDER ORBITS
ANTHROPIC
AZURE
GCP
AWS
LOCAL (FREE)