| Role | Cluster | Sub / Owner | Region | Nodepools | VM SKU | Nodes | Priority / Status |
|---|---|---|---|---|---|---|---|
| Primary / pemos.ca | aks-eose-aaas-dev | 427873ee | canadacentral | system + agents | D2s_v5 | 4 | SPOT |
| CLO + GPU rig | aks-eose-clo-gpu | 440a5792 / Amani | canadacentral | system (D2s_v3) + gpu (T4) | D2s_v3 / NC6s_v3 | 2+0 | GPU → 0 |
| Master 1 | aks-master1 | 440a5792 / Amani | canadacentral | system | B2s_v2 | 1 | REGULAR |
| Master | aks-master | 440a5792 / Amani | canadacentral | system | B2s_v2 | 1 | REGULAR |
| Deseof sovereign | aks-eose-deseof | 239915fb / eose | canadacentral | system + agents | B2s | 2 | TO CREATE |

⚡ FLEET LAW: These 3 clusters form a single logical compute plane. No cluster is indispensable.

⚡ FLEET LAW: The deseof cluster (239915fb) must be created before V11 chaos drill month-1.

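
The second law above can be turned into a pre-drill gate. A minimal sketch, assuming the Azure CLI is logged in; the resource group and full subscription ID are parameters because the source gives only the truncated prefix `239915fb`, and `AZ` is a hypothetical override hook for dry runs.

```shell
#!/bin/sh
# Sketch: refuse to start the chaos drill until aks-eose-deseof exists.
# AZ can be overridden (for example AZ=true) to exercise the logic offline.
AZ="${AZ:-az}"

# deseof_cluster_exists RESOURCE_GROUP SUBSCRIPTION
# Succeeds only if `az aks show` can find the cluster.
deseof_cluster_exists() {
    $AZ aks show \
        --resource-group "$1" \
        --name aks-eose-deseof \
        --subscription "$2" \
        --output none 2>/dev/null
}

# drill_gate RESOURCE_GROUP SUBSCRIPTION
drill_gate() {
    if deseof_cluster_exists "$1" "$2"; then
        echo "deseof present: drill may proceed"
    else
        echo "deseof missing: create it before the drill"
        return 1
    fi
}
```
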
| Mechanism | Detail | SLA |
|---|---|---|
| Graceful drain | Azure spot eviction notice → 30s grace period → pod checkpoints state → terminates cleanly | 30s |
| HVCP reroute | HVCP detects node loss, scores remaining 2 clusters, reroutes traffic to survivor | ≤ 30s |
| PodDisruptionBudgets | All critical workloads: minAvailable=1. No workload loses last replica without migration. | Mandatory |
| Checkpoint save | Stateful pods write checkpoint to Azure Blob / Qdrant before termination | Pre-drain |
| Spot discount | User nodepools on spot VMs — D2s_v5 spot vs regular | 60–70% saving |
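
The PodDisruptionBudget row can be sketched with `kubectl create poddisruptionbudget`. This is an illustration only: the PDB name and the label selector passed in are hypothetical, and `KUBECTL` is an override hook added here so the command can be previewed without a cluster.

```shell
#!/bin/sh
# Sketch: create a PodDisruptionBudget with minAvailable=1 for one workload.
# Set KUBECTL=echo to print the command instead of running it.
KUBECTL="${KUBECTL:-kubectl}"

# ensure_pdb NAMESPACE PDB_NAME SELECTOR
# SELECTOR is a label selector such as app=qdrant (hypothetical label).
ensure_pdb() {
    if [ "$#" -ne 3 ]; then
        echo "usage: ensure_pdb NAMESPACE PDB_NAME SELECTOR" >&2
        return 2
    fi
    $KUBECTL create poddisruptionbudget "$2" \
        --namespace="$1" \
        --selector="$3" \
        --min-available=1
}
```

Example: `ensure_pdb qdrant qdrant-pdb app=qdrant`. With a single replica and `minAvailable=1`, `kubectl drain` blocks on that pod until a replacement is running elsewhere, which is what enforces "no workload loses last replica without migration".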

⚡ FLEET LAW (ARB1-B2): All user nodepools MUST use spot/preemptible VMs. System nodepools exempt. No waiver.

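
A spot user nodepool that satisfies ARB1-B2 could be added roughly as follows. This is a sketch: the resource group, pool name and node count are placeholders not given in the source, and `AZ` is a hypothetical override hook for dry runs. AKS itself only allows Spot priority on secondary (user) pools, which matches the system-pool exemption.

```shell
#!/bin/sh
# Sketch: add a spot user nodepool to an existing AKS cluster.
# Set AZ=echo to print the command instead of running it.
AZ="${AZ:-az}"

# add_spot_pool RESOURCE_GROUP CLUSTER POOL_NAME VM_SIZE NODE_COUNT
add_spot_pool() {
    if [ "$#" -ne 5 ]; then
        echo "usage: add_spot_pool RESOURCE_GROUP CLUSTER POOL_NAME VM_SIZE NODE_COUNT" >&2
        return 2
    fi
    # --spot-max-price -1 means "pay up to the regular on-demand price".
    $AZ aks nodepool add \
        --resource-group "$1" \
        --cluster-name "$2" \
        --name "$3" \
        --mode User \
        --priority Spot \
        --eviction-policy Delete \
        --spot-max-price -1 \
        --node-vm-size "$4" \
        --node-count "$5"
}
```

Example: `add_spot_pool <resource-group> aks-eose-aaas-dev spotagents Standard_D2s_v5 3`. AKS taints spot nodes with `kubernetes.azure.com/scalesetpriority=spot:NoSchedule`, so workloads meant for the pool need a matching toleration.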
Chaos drill: three test categories, all of which must pass before month-end. A full 3-cluster migration runs quarterly.
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data
kubectl delete pod -n ingress-nginx -l app.kubernetes.io/component=controller
HVCP selects target → Flux kustomization patch → verify → migrate back
curl ts-lianli01-ollama:11434/api/tags && curl ts-lounge-ollama:11434/api/tags && curl ts-msclo-ollama:11434/api/tags
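
The chained `curl` check above stops at the first failure and does not say which host failed. A per-host variant might look like the sketch below; the 5-second timeout and the `CURL` override hook are assumptions, the hostnames and port are taken from the command above.

```shell
#!/bin/sh
# Sketch: probe each Ollama endpoint separately and report per-host status.
CURL="${CURL:-curl}"
OLLAMA_HOSTS="ts-lianli01-ollama ts-lounge-ollama ts-msclo-ollama"

# Prints one OK/FAIL line per host; exit status is the number of failed hosts.
check_ollama_fleet() {
    failed=0
    for host in $OLLAMA_HOSTS; do
        # -f: treat HTTP errors as failures; -sS: quiet, but still print errors.
        if $CURL -fsS --max-time 5 -o /dev/null "http://$host:11434/api/tags"; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}
```
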
| Service | Primary NS | Backup / Replication | Tech | Notes |
|---|---|---|---|---|
| Vector DB | qdrant | Replicated cross-cluster | Qdrant | pemclau-v11 · 10,435 vectors |
| Cache | redis | Replicated | Redis | Session + inference cache |
| Relational | postgres | Backup RGs | PostgreSQL | CRUD + event store |
| Graph | neo4j | — | Graph DB | PEMCLAU knowledge graph |
| Secrets | external-secrets | — | Azure Key Vault | ESO sync, all clusters |
| Object Store | blob | Backup RGs | Azure Storage | Checkpoints, artifacts |
Intent Routing Engine — classifies and routes all inbound requests across the fleet
Sovereign gateway layer — controls cross-cluster traffic, HVCP integration point
Chief Legal Officer inference wrapper — privacy-preserving CLO context layer
Orchestration & Node Binding Architecture — fleet node registration + health
Sovereign corporate framework — identity, compliance, billing governance
6 symbols: γ₁⚓ H=H†⬡ LSOS〰️ WLD🌀 FEP γ FOF🌌 — constitutional ground truth
| Workload Type | Migration Time | Data Risk | Trigger |
|---|---|---|---|
| Stateless (portal, API) | ~30s | None | Auto — HVCP score |
| Stateful (Qdrant, Redis) | ~120s | PVC reattach | Manual approval |
| Mail (Haraka) | ~60s | Queue drain first | Manual + drain gate |
| CLO workloads | ~90s | None | Auto — CLO cluster always target |
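
The migration table can be encoded as a small lookup so automation never migrates a workload that requires approval. A sketch only: the short workload keys (`stateless`, `stateful`, `mail`, `clo`) are names invented here, and the time budgets are the approximate figures from the table.

```shell
#!/bin/sh
# Sketch: migration policy lookup derived from the table above.

# migration_policy WORKLOAD_TYPE
# Prints "<time_budget_seconds> <trigger>" where trigger is auto or manual.
migration_policy() {
    case "$1" in
        stateless) echo "30 auto" ;;     # portal, API
        stateful)  echo "120 manual" ;;  # Qdrant, Redis: PVC reattach
        mail)      echo "60 manual" ;;   # Haraka: drain the queue first
        clo)       echo "90 auto" ;;     # CLO cluster is always the target
        *)
            echo "unknown workload type: $1" >&2
            return 1
            ;;
    esac
}

# may_auto_migrate WORKLOAD_TYPE
# Succeeds only when the table allows migration without human approval.
may_auto_migrate() {
    policy=$(migration_policy "$1") || return 1
    [ "${policy#* }" = "auto" ]
}
```

Example: `may_auto_migrate stateful || echo "needs manual approval"`.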

| Cluster | Sub | Current | Target (spot) | Saving (/mo) | Action |
|---|---|---|---|---|---|
| aks-eose-aaas-dev (primary) | 427873ee | ~$140/mo | ~$80/mo | $60 | Convert agents to spot |
| aks-eose-clo-gpu | 440a5792 | ~$70/mo | ~$40/mo | $30 | GPU stays @ 0, system → spot |
| aks-master1 + aks-master | 440a5792 | ~$60/mo | ~$40/mo | $20 | Downsize to B2s spot |
| aks-eose-deseof (new) | 239915fb | $0 | ~$25/mo | -$25 | Create — B2s spot |
| aks-kantai-eose-dev (B4ms) | 458e8558 | ~$110/mo | $0 | $110 | CONFIRM empty → STOP |
| aks-kantai-eose-ce (meek-mail) | 458e8558 | ~$50/mo | ~$50/mo | $0 | KEEP — Haraka live |
| TOTAL | | ~$430/mo | ~$235/mo | $195 saved | |
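
The totals can be re-derived from the rows as a sanity check. A minimal sketch using the approximate monthly USD figures from the table; the row format (name, current, target) is chosen here for illustration.

```shell
#!/bin/sh
# Sketch: recompute the TOTAL row from the per-cluster figures.

# One row per line: cluster, current USD/month, target USD/month.
cost_rows() {
    cat <<'EOF'
aks-eose-aaas-dev 140 80
aks-eose-clo-gpu 70 40
aks-master1+aks-master 60 40
aks-eose-deseof 0 25
aks-kantai-eose-dev 110 0
aks-kantai-eose-ce 50 50
EOF
}

# Prints "<current_total> <target_total> <saving>".
cost_totals() {
    cost_rows | awk '{ cur += $2; tgt += $3 } END { print cur, tgt, cur - tgt }'
}

cost_totals
```

The deseof row contributes a negative saving (-$25) because it is a new cluster, which is why the net figure is $195 rather than $220.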