DEBT BONIXER · TWO DEBT TYPES UNIFIED
The fleet carries two debt types: Tech Debt (sorries, open TRBs, undeployed LABRs, broken KCFs) and Cloud Cost Debt (daily Azure spend, GPU pools left running, over-provisioned nodes). The Debt Bonixer surfaces both in one view, combining the CFO lens and the CTO lens. The GravDebt engine drives the cloud cost gravity model. γ₁ floor = cost floor.
CLOUD COST DEBT · DAILY BURN
APRIL COST · CA$7,332 AT DAY 19
Tracking toward ~CA$11,600 for the month vs. the CA$10,552 March baseline. Day 95 = May 9. Check: cloud-cost-report.sh, daily at 8pm EDT. Scale-down script fixed (the while IFS read -r loop bug). GPU pools: all at 0 as of last check.
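The ~CA$11,600 tracking figure is consistent with a straight-line extrapolation of month-to-date spend. A minimal sketch, assuming a linear daily burn and a 30-day April; the function name project_month_end is hypothetical and is not part of cloud-cost-report.sh:

```python
def project_month_end(spend_to_date: float, day_of_month: int, days_in_month: int) -> float:
    """Linear projection: average daily burn so far, extended to the full month."""
    if not 0 < day_of_month <= days_in_month:
        raise ValueError("day_of_month must be between 1 and days_in_month")
    return spend_to_date / day_of_month * days_in_month


# Figures from the notes above: CA$7,332 at day 19, March baseline CA$10,552.
april_projection = project_month_end(7332.0, 19, 30)
march_baseline = 10552.0
overrun = april_projection - march_baseline
overrun_pct = overrun / march_baseline * 100

print(f"Projected April: CA${april_projection:,.0f}")
print(f"Over March baseline: CA${overrun:,.0f} ({overrun_pct:.1f}%)")
```

On these figures the projection comes to roughly CA$11,577, about 9.7% over the March baseline. A linear projection ignores step changes such as a GPU pool being started mid-month, so it is only a rough guide.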
H100: ~CA$15/hr/node · T4: ~CA$1.50/hr/node · NEVER auto-scale — alert only
GPU POOL WATCH · COST CRITICAL
Any NC* node pool with count > 0 and no active GPU workload = alert Kay immediately. Do NOT auto-scale down. krs-portal and local-llm in Pending state is expected (node memory at 100%, scaled down nightly).
Rule: GPU pool >0 + no workload = alert · Kay decides
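The alert-only rule can be sketched as a pure check that reports idle pools and never changes them. This is an illustration under stated assumptions: the GpuPool type, the pool names, and the way workload counts are obtained are all hypothetical; only the hourly rates come from the notes above.

```python
from dataclasses import dataclass

# Approximate hourly rates from the notes above (CA$ per node per hour).
HOURLY_RATE_CAD = {"H100": 15.0, "T4": 1.5}


@dataclass
class GpuPool:
    name: str              # hypothetical pool name
    gpu_type: str          # key into HOURLY_RATE_CAD
    node_count: int        # current node count in the pool
    active_workloads: int  # GPU workloads currently scheduled on the pool


def idle_pool_alerts(pools):
    """Return one alert message per pool that has nodes but no GPU workload.

    Alert only: nothing here scales a pool. The decision stays with Kay.
    """
    alerts = []
    for pool in pools:
        if pool.node_count > 0 and pool.active_workloads == 0:
            daily_cost = HOURLY_RATE_CAD[pool.gpu_type] * 24 * pool.node_count
            alerts.append(
                f"ALERT Kay: {pool.name} has {pool.node_count} idle node(s), "
                f"~CA${daily_cost:,.0f}/day"
            )
    return alerts


# Hypothetical snapshot: one idle H100 node, one busy T4 pool, one empty pool.
pools = [
    GpuPool("nch100", "H100", node_count=1, active_workloads=0),
    GpuPool("nct4", "T4", node_count=2, active_workloads=3),
    GpuPool("nct4b", "T4", node_count=0, active_workloads=0),
]
alerts = idle_pool_alerts(pools)
for message in alerts:
    print(message)
```

At the quoted rates a single idle H100 node costs about CA$360 per day and a single idle T4 node about CA$36 per day, which is why the check alerts immediately rather than waiting for the daily report.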
TECH DEBT · PER SILO
GRAVDEBT ENGINE · COST GRAVITY
GRAVDEBT: Cloud cost has gravity. The longer a node runs at high cost, the harder it is to stop (team reliance, cached state, contract locks). GravDebt models this as a gravitational field in which high-gravity nodes carry high switching cost. The V3 portal surfaces GravDebt per AKS cluster as a heat map. Current primary cost driver: aks-eose-aaas-dev (Canada East).
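These notes do not state the GravDebt formula. Purely to illustrate the idea that gravity grows with both cost rate and time running, a toy score might look like the sketch below. The function name, the logarithmic time term, and the lock_in weight are all assumptions made for illustration, not the engine's actual model.

```python
import math


def toy_gravity(hourly_cost_cad: float, hours_running: float, lock_in: float = 1.0) -> float:
    """Toy gravity score (illustrative only, not the GravDebt engine).

    Grows with the cost rate and, more slowly, with time running.
    lock_in is a hypothetical weight for team reliance, cached state,
    and contract locks.
    """
    if hourly_cost_cad < 0 or hours_running < 0 or lock_in < 0:
        raise ValueError("inputs must be non-negative")
    return hourly_cost_cad * math.log1p(hours_running) * lock_in


# Same H100 rate from the notes above, compared at two hypothetical uptimes.
short_run = toy_gravity(15.0, hours_running=10)
long_run = toy_gravity(15.0, hours_running=1000)
print(long_run > short_run)
```

A score of this shape could be ranked per cluster to colour a heat map, but the real weighting would have to come from the GravDebt engine itself.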