⚓ K8S OSS SOVEREIGNTY

ARB-596 · The complexity is the moat · physicsengines.apps.eose.ca = CRD-001 · EOSE extended the API

KUBERNETES IS OSS FOR A REASON

The complexity is deliberate. It is the moat. Most organizations run managed K8s and never touch the API. EOSE extends it. We wrote our own CRDs. The platform understands our domain.

physicsengines.apps.eose.ca — WE EXTENDED THE API
116 CRDs LIVE
3 TRIOS
10 LAYERS
6/6 FULL STANDARD · MDSMS ARTIFACTS

☁️ CLOUD — AKS

aks-eose-aaas-dev — 116 CRDs, full operator suite
kantai-cc + kantai-ce — multi-cluster, gangway portal
Managed control plane — Azure handles etcd, API server HA
T4/H100 GPU nodes — cloud burst when local is full
Global reach — pemos.ca, eose.ca, deseof.com all served from AKS ingress

🏠 ON-PREM — k3s LAN

🚧 WAVE 1: STORAGE VAULT TRIO (forge + pcdev + NAS), k3s server next
🚧 WAVE 2: LOCAL TRIO (msi01 + msclo + eose-dev)
RTX 4080/4090/5090 GPUs ×5 — local compute always faster than cloud
💾 Longhorn storage: NVMe distributed block across LAN nodes
🔒 Zero cloud egress cost: LAN traffic stays on LAN

THE 10-LAYER K8S SOVEREIGNTY STACK

01 · IDENTITY & PKI ⬡ H=H†
Every workload gets a cryptographic identity. cert-manager issues, Tailscale overlays, Istio enforces mTLS. Nobody speaks without proof.
cert-manager ✅ · Tailscale ✅ · Istio security ✅ · SPIRE · Kyverno
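A minimal sketch of how the issue-and-enforce pair above could be declared, written as Python that prints a JSON manifest list (kubectl accepts JSON as well as YAML). The namespace `physics`, the workload name `solver`, and the issuer `eose-ca` are hypothetical, and the API versions assume reasonably recent cert-manager and Istio releases.

```python
import json


def strict_mtls(namespace: str) -> dict:
    """Istio PeerAuthentication: workloads in the namespace accept only mTLS."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"mtls": {"mode": "STRICT"}},
    }


def workload_certificate(name: str, namespace: str, dns_name: str, issuer: str) -> dict:
    """cert-manager Certificate: the issued key pair lands in a TLS Secret."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "secretName": f"{name}-tls",
            "dnsNames": [dns_name],
            "issuerRef": {"name": issuer, "kind": "ClusterIssuer"},
        },
    }


# Hypothetical names, for illustration only.
bundle = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        strict_mtls("physics"),
        workload_certificate(
            "solver", "physics", "solver.physics.svc.cluster.local", "eose-ca"
        ),
    ],
}
print(json.dumps(bundle, indent=2))
```

Piping the output into `kubectl apply -f -` would create both objects, provided a ClusterIssuer with the assumed name exists.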
02 · NETWORK & MESH 〰 LSOS + γ FEP
Calico enforces network policy at L3/L4. Istio enforces it at L7. ExternalDNS keeps DNS truthful. Every packet flows through LSOS before reaching a service.
Calico/Tigera ✅ · Istio ✅ · ExternalDNS ✅ · MetalLB · Gateway API · Submariner
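The two enforcement points can be sketched side by side: a standard Kubernetes NetworkPolicy (which Calico enforces) for the packet level, and an Istio AuthorizationPolicy for the request level. This is an illustration only; the namespace, app label, path, and caller identity are hypothetical.

```python
import json


def default_deny(namespace: str) -> dict:
    """Packet level: an empty podSelector selects every pod; no rules means deny."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_get(namespace: str, app: str, caller_principal: str, path: str) -> dict:
    """Request level: one caller identity may GET one path on one app."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": f"{app}-allow-get", "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": {"app": app}},
            "action": "ALLOW",
            "rules": [
                {
                    "from": [{"source": {"principals": [caller_principal]}}],
                    "to": [{"operation": {"methods": ["GET"], "paths": [path]}}],
                }
            ],
        },
    }


# Hypothetical workload and caller identity.
l3_policy = default_deny("physics")
l7_policy = allow_get(
    "physics", "solver", "cluster.local/ns/physics/sa/gateway", "/api/*"
)
print(json.dumps({"apiVersion": "v1", "kind": "List",
                  "items": [l3_policy, l7_policy]}, indent=2))
```

A real default-deny would also need explicit allow rules for DNS and for sidecar traffic to the mesh control plane; those are left out to keep the sketch short.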
03 · GITOPS & DELIVERY ⚓ γ₁
Flux watches eose-sre/openclaw-fleet. Every change to Git propagates to the cluster. Image automation closes the loop — new tag in registry → commit back to Git → deploy.
Flux ✅ (all 5 controllers) · Argo Rollouts · Argo Workflows
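A rough sketch of the Git-to-cluster half of that loop, as Python printing the two Flux objects involved. The repository URL host, the branch, and the path `./clusters/aks` are assumptions; only the repository name eose-sre/openclaw-fleet comes from the text above. Image automation objects are omitted.

```python
import json


def git_source(name: str, url: str, branch: str) -> dict:
    """Flux GitRepository: poll the repo and expose its contents as an artifact."""
    return {
        "apiVersion": "source.toolkit.fluxcd.io/v1",
        "kind": "GitRepository",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": {"interval": "1m", "url": url, "ref": {"branch": branch}},
    }


def kustomization(name: str, source_name: str, path: str) -> dict:
    """Flux Kustomization: apply one path of the source and prune removed objects."""
    return {
        "apiVersion": "kustomize.toolkit.fluxcd.io/v1",
        "kind": "Kustomization",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": {
            "interval": "10m",
            "sourceRef": {"kind": "GitRepository", "name": source_name},
            "path": path,
            "prune": True,
        },
    }


# Placeholder host: the real Git server is not stated in this document.
source = git_source(
    "openclaw-fleet", "https://git.example.com/eose-sre/openclaw-fleet", "main"
)
sync = kustomization("fleet", "openclaw-fleet", "./clusters/aks")
print(json.dumps({"apiVersion": "v1", "kind": "List",
                  "items": [source, sync]}, indent=2))
```

With `prune` enabled, deleting a manifest from Git also deletes the object from the cluster, which is what makes Git the single source of truth.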
04 · SECRETS & CONFIG 🌀 WLD
External Secrets Operator pulls from Azure KV, AWS SM, GCP SM. Secrets Store CSI mounts them as volumes. No secrets in Git. No hardcoded credentials. WLD resets the credential state cleanly.
ESO ✅ · Secrets Store CSI ✅ · Vault Operator · Reloader
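As an illustration of "no secrets in Git", the object that does live in Git can be sketched like this: an ExternalSecret that only names where the value is stored. The store name `azure-kv`, the remote key, and the namespace are hypothetical, and the API version assumes an External Secrets Operator release that serves `v1beta1`.

```python
import json


def external_secret(name: str, namespace: str, store: str,
                    remote_key: str, secret_key: str) -> dict:
    """ExternalSecret: copy one remote value into one key of a Kubernetes Secret."""
    return {
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "refreshInterval": "1h",
            "secretStoreRef": {"name": store, "kind": "ClusterSecretStore"},
            "target": {"name": name},
            "data": [{"secretKey": secret_key, "remoteRef": {"key": remote_key}}],
        },
    }


# Hypothetical names; the manifest holds a pointer to the secret, never its value.
db_secret = external_secret(
    "solver-db", "physics", "azure-kv", "solver-db-password", "password"
)
print(json.dumps(db_secret, indent=2))
```

The operator resolves `remoteRef.key` against the referenced store and writes the result into a Secret named by `target.name`.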
05 · OBSERVABILITY 〰 LSOS
Prometheus scrapes everything. Kiali reads the service mesh graph. Alertmanager fires when the floor wobbles. LSOS doesn't just read: it fires when something isn't right.
Prometheus Operator ✅ · Kiali ✅ · OpenTelemetry Op · Loki · Grafana Op
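One way the "fires when something isn't right" part could be expressed is a PrometheusRule, the CRD the Prometheus Operator uses for alert definitions. This is a sketch under assumptions: the rule name, namespace, and severity label are invented for illustration.

```python
import json


def target_down_rule(namespace: str) -> dict:
    """PrometheusRule: alert when any scrape target has been down for 5 minutes."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": "target-down", "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": "availability",
                    "rules": [
                        {
                            "alert": "TargetDown",
                            # `up` is 0 when Prometheus failed to scrape the target.
                            "expr": "up == 0",
                            "for": "5m",
                            "labels": {"severity": "critical"},
                            "annotations": {
                                "summary": "{{ $labels.job }} on "
                                           "{{ $labels.instance }} is not being scraped"
                            },
                        }
                    ],
                }
            ]
        },
    }


rule = target_down_rule("monitoring")  # hypothetical namespace
print(json.dumps(rule, indent=2))
```

Whether the operator picks the rule up depends on the rule selector configured on the Prometheus resource, so a matching label may be needed in `metadata`.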
06 · STORAGE & DATA 🌀 WLD
VolumeSnapshot CRDs capture PV state. CloudNativePG runs HA Postgres. Redis HA holds state. Velero is meant to back up to NAS/S3 but is not installed yet, so there is currently no backup (P0). If the floor drops, WLD restores it.
Volume Snapshots ✅ · Velero ❌ P0: NO BACKUP · Longhorn · CloudNativePG
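A sketch of the two objects this layer revolves around: a VolumeSnapshot of a single PVC, and a Velero Schedule of the kind that would close the backup gap once Velero is installed. The PVC name, snapshot class, namespaces, schedule, and retention are all hypothetical.

```python
import json


def pvc_snapshot(name: str, namespace: str, pvc: str, snapshot_class: str) -> dict:
    """VolumeSnapshot: point-in-time copy of one PersistentVolumeClaim."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc},
        },
    }


def nightly_backup(name: str, namespaces: list, keep: str = "720h0m0s") -> dict:
    """Velero Schedule: back up the given namespaces at 02:00, keep for `keep`."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            "schedule": "0 2 * * *",
            "template": {"includedNamespaces": namespaces, "ttl": keep},
        },
    }


# Hypothetical PVC and snapshot class names.
snap = pvc_snapshot("solver-db-snap", "physics", "solver-db-data", "csi-snapclass")
backup = nightly_backup("nightly", ["physics"])
print(json.dumps({"apiVersion": "v1", "kind": "List",
                  "items": [snap, backup]}, indent=2))
```

A snapshot lives on the same storage backend as the volume it copies, so on its own it does not replace an off-cluster backup.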
07 · CHAOS & RESILIENCE 🌌 FOF
Chaos Mesh injects failure on purpose. Pod kill. Network partition. CPU stress. Time skew. FOF is the breach — controlled. γ₁ holds or it doesn't. Now you know before production finds out.
Chaos Mesh ❌ NOT INSTALLED · LitmusChaos · Steadybit
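For a sense of what a controlled breach looks like once Chaos Mesh is installed, here is a sketch of the simplest experiment, a pod kill. The experiment name, namespaces, and app label are hypothetical.

```python
import json


def pod_kill(name: str, namespace: str, target_namespace: str, app: str) -> dict:
    """Chaos Mesh PodChaos: kill one randomly chosen pod matching the selector."""
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "PodChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": "pod-kill",
            "mode": "one",  # pick a single pod out of all matches
            "selector": {
                "namespaces": [target_namespace],
                "labelSelectors": {"app": app},
            },
        },
    }


# Hypothetical target: one pod of the `solver` app in the `physics` namespace.
experiment = pod_kill("solver-pod-kill", "chaos-testing", "physics", "solver")
print(json.dumps(experiment, indent=2))
```

If the Deployment behind the label recovers without user-visible errors, the experiment passes; if not, the gap was found before production found it.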
08 · AUTOSCALING γ FEP
KEDA scales on event depth — queue length, Kafka lag, custom metrics. VPA right-sizes pod resources. FEP switches paradigm based on load signal.
KEDA ❌ NOT INSTALLED · VPA · Descheduler · Volcano
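Scaling on Kafka lag can be sketched in two parts: the ScaledObject KEDA would read, and a deliberately simplified version of the replica arithmetic (lag divided by the per-replica threshold, rounded up and clamped). Broker address, topic, consumer group, and Deployment name are hypothetical, and the real scaler applies further rules that this omits.

```python
import json
import math


def kafka_scaled_object(name: str, namespace: str, deployment: str, brokers: str,
                        group: str, topic: str, lag_threshold: int,
                        max_replicas: int) -> dict:
    """KEDA ScaledObject: scale a Deployment on consumer-group lag."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "scaleTargetRef": {"name": deployment},
            "minReplicaCount": 0,
            "maxReplicaCount": max_replicas,
            "triggers": [
                {
                    "type": "kafka",
                    "metadata": {
                        "bootstrapServers": brokers,
                        "consumerGroup": group,
                        "topic": topic,
                        "lagThreshold": str(lag_threshold),
                    },
                }
            ],
        },
    }


def desired_replicas(total_lag: int, lag_threshold: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Simplified model: one replica per `lag_threshold` messages of lag."""
    wanted = math.ceil(total_lag / lag_threshold)
    return max(min_replicas, min(max_replicas, wanted))


scaled = kafka_scaled_object("solver-jobs", "physics", "solver",
                             "kafka.physics.svc:9092", "solver", "jobs", 50, 10)
print(json.dumps(scaled, indent=2))
print(desired_replicas(230, 50, 0, 10))  # 230 / 50 = 4.6, rounded up to 5
```

With `minReplicaCount` at 0, an empty queue lets the workload scale to zero.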
09 · POLICY & COMPLIANCE ⬡ H=H†
OPA/Gatekeeper enforces policy at admission. Kyverno mutates and validates. Falco detects runtime anomalies. Trivy scans running images. H=H† is not just identity — it's enforcement.
OPA/Gatekeeper · Kyverno · Falco · Trivy Op
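Admission-time enforcement might look like the following Kyverno policy, which rejects any Pod that lacks a given label. The label name `team` and the policy name are hypothetical, and the spec-level `validationFailureAction` field assumes a Kyverno release that still accepts it.

```python
import json


def require_label(label: str) -> dict:
    """Kyverno ClusterPolicy: reject Pods that do not carry `label`."""
    return {
        "apiVersion": "kyverno.io/v1",
        "kind": "ClusterPolicy",
        "metadata": {"name": f"require-{label}-label"},
        "spec": {
            "validationFailureAction": "Enforce",  # block, do not just audit
            "rules": [
                {
                    "name": f"check-{label}-label",
                    "match": {"any": [{"resources": {"kinds": ["Pod"]}}]},
                    "validate": {
                        "message": f"The label `{label}` is required.",
                        # "?*" matches any non-empty value.
                        "pattern": {"metadata": {"labels": {label: "?*"}}},
                    },
                }
            ],
        },
    }


policy = require_label("team")  # hypothetical label
print(json.dumps(policy, indent=2))
```

Switching the action to `Audit` reports violations without blocking, which is a common first step before enforcing.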
10 · AI/ML WORKLOADS ⚓ γ₁ · 🌌 FOF
physicsengines.apps.eose.ca is our CRD. We extended the K8s API. GPU Operator manages RTX nodes. KServe serves models. Ollama Operator (TO BUILD) deploys models as custom resources.
physicsengines.apps.eose.ca ★ · GPU Operator ❌ NEEDED · KServe · Ollama Op
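To make "we extended the API" concrete, here is a sketch of what a definition for physicsengines.apps.eose.ca could look like. A CRD name is always `<plural>.<group>`, so the plural `physicsengines` and the group `apps.eose.ca` follow from the name itself. Everything else is an assumption for illustration: the kind `PhysicsEngine`, the version `v1alpha1`, and the `engine` and `gpus` fields are not taken from the real CRD-001.

```python
import json


def crd_from_name(crd_name: str, kind: str, version: str = "v1alpha1") -> dict:
    """Build a CustomResourceDefinition; plural and group come from the name."""
    plural, group = crd_name.split(".", 1)
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        "metadata": {"name": crd_name},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"plural": plural, "singular": kind.lower(), "kind": kind},
            "versions": [
                {
                    "name": version,
                    "served": True,
                    "storage": True,
                    "schema": {
                        "openAPIV3Schema": {
                            "type": "object",
                            "properties": {
                                "spec": {
                                    "type": "object",
                                    # Hypothetical fields, not the real schema.
                                    "properties": {
                                        "engine": {"type": "string"},
                                        "gpus": {"type": "integer", "minimum": 0},
                                    },
                                }
                            },
                        }
                    },
                }
            ],
        },
    }


crd = crd_from_name("physicsengines.apps.eose.ca", "PhysicsEngine")
print(json.dumps(crd, indent=2))
```

Once a definition like this is applied, the API server serves the new kind at `/apis/apps.eose.ca/<version>/...` and `kubectl get physicsengines` works like any built-in resource.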

WHY THE COMPLEXITY IS THE MOAT

Most teams treat K8s complexity as a cost. They reach for managed platforms, serverless, PaaS. They trade control for convenience. The moat disappears with it.

EOSE treats complexity as capital. Every CRD we understand is a pattern we own. Every operator we deploy is infrastructure we control. The fleet runs the same spec cloud or on-prem — same CRDs, same operators, same patterns. The platform goes where the work goes.

"Kubernetes is OSS for a reason. It is deliberately complex.
That complexity is not a bug — it is the moat.
We understand it. We run it. We extend it.
We wrote CRD-001. The API belongs to us now.
physicsengines.apps.eose.ca — that is sovereignty."

HOSTING OPTIONS — THE FULL MAP

AKS (Azure) · ✅ LIVE
Managed control plane. D2s_v5 nodes. 116 CRDs live. canadacentral. Full operator suite. Istio, Flux, ESO, Calico all running.
k3s (LAN)
Single binary. 512MB overhead. Runs on WSL2 or native Linux. Same K8s API as AKS. forge = server, all others = agents. One command install.
k3d
k3s inside Docker. Spin up/destroy in seconds. Dev + CI cluster testing. Test new operators before pushing to AKS.
MicroK8s · FUTURE OPTION
Snap-packaged Kubernetes from Canonical. Single-node. Good for msclo/forge on native Linux. Addons are switched on with microk8s enable.
Rancher (SUSE)
Multi-cluster dashboard. Manages AKS + local k3s + any cluster from one UI. Lifecycle + fleet management.
Talos OS · FUTURE · SOVEREIGN
Immutable OS for K8s nodes. API-only management. No SSH, no shell. Future sovereign silo OS for bare metal.

OWN THE PLATFORM. OWN THE STACK.

Same spec. Cloud or on-prem. AKS or k3s. The CRDs are the same. The operators are the same. The patterns are the same. The platform follows the work — not the other way around.
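The "same CRDs" claim is checkable. A minimal sketch, assuming the output of `kubectl get crd -o name` has been captured from each cluster: parse both listings and report anything installed on one side only. The sample listings below are invented for illustration.

```python
def parse_crd_names(kubectl_output: str) -> set:
    """Turn `kubectl get crd -o name` output into a set of bare CRD names."""
    return {
        line.split("/", 1)[-1].strip()
        for line in kubectl_output.splitlines()
        if line.strip()
    }


def crd_drift(cloud: set, lan: set) -> dict:
    """Report CRDs present on one cluster but missing on the other."""
    return {
        "missing_on_lan": sorted(cloud - lan),
        "missing_on_cloud": sorted(lan - cloud),
    }


# Invented sample listings; real input would come from each cluster.
aks_listing = """\
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io
customresourcedefinition.apiextensions.k8s.io/physicsengines.apps.eose.ca
"""
k3s_listing = """\
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io
"""

drift = crd_drift(parse_crd_names(aks_listing), parse_crd_names(k3s_listing))
print(drift)
```

An empty result on both sides means the two clusters expose the same extended API surface; it says nothing about whether the operators behind those CRDs are running.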