FLEET PIPELINE · HVCP · ALL ENVS · IMAGE CURATION
CURRENT IMAGE: v405
ENVS: 6
ACR: eosefleetacrdev
CERTS LIVE: 47/53
IMAGE FACTORY
FLEET PIPELINE · HVCP · SANDBOX → QE → DEV → QA → STAGE → PROD
The full image lifecycle: built on msi01, curated through pathflow-vN tagging, promoted across 6 environments by Canon gate, and shipped to 6 production portals simultaneously. γ₁ = 14.134 is the build floor. In the target pipeline every image is Canon-scored before promotion; today that review is manual.
🔱 HVCP — THE GPU PIPELINE TAKES
WHAT IT IS
hvcp-system is the GPU inference pool — pemos-hvcp orchestrates T4/H100 jobs, pemos-pool-router is the v2.0.0 MAL cascade router (tier 3), pemos-everytime handles recurring inference jobs, pemos-orbit manages model rotation. Scotty cap: "I cannae break the laws of physics." He does.
WHAT'S GOOD
Pool router v2.0.0 is clean. 4-tier cascade is deployed and breathing. The /breath endpoint returns memory tier (FULL/RICH/MID/THIN/DARK) — this is the fleet's health signal. ollama T4 models live: qwen2.5:14b + qwen2.5:7b + mistral:7b. hvcp is tier 3 of MAL — it's the cloud fallback when forge (tier 1) and msclo (tier 2) are down.
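The fallback behaviour described above can be sketched as a walk over the cascade in tier order. This is a minimal illustration, not the pool router's actual code: the `Tier` type, `is_usable`, `pick_tier`, and the rule that DARK means "cannot serve" are assumptions. Only the tier order (forge, msclo, hvcp) and the five /breath values come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Memory tiers reported by /breath, healthiest first (from the text above).
BREATH_TIERS = ["FULL", "RICH", "MID", "THIN", "DARK"]


@dataclass
class Tier:
    rank: int    # position in the MAL cascade (1 = tried first)
    name: str    # e.g. "forge", "msclo", "hvcp"
    breath: str  # last value read from that tier's /breath endpoint


def is_usable(tier: Tier) -> bool:
    # Assumption: DARK, or any value outside the known list, means "cannot serve".
    return tier.breath in BREATH_TIERS and tier.breath != "DARK"


def pick_tier(tiers: List[Tier]) -> Optional[Tier]:
    # Walk the cascade in rank order and return the first usable tier.
    for tier in sorted(tiers, key=lambda t: t.rank):
        if is_usable(tier):
            return tier
    return None  # every tier is dark
```

With forge and msclo both reporting DARK and hvcp reporting MID, `pick_tier` returns the hvcp entry, which matches the "cloud fallback" role described for tier 3.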
WHAT'S MISSING
No H100 active yet (T4 only). No model warm-up strategy (cold starts on first inference). No per-model queue depth visibility. The hvcp-coordinator CronJob (MEGSCIFIAR) was just deployed: it checks every 15 min but doesn't yet write state to campfire:events. That's the next wire.
🌀 WHAT WARP LOOKS LIKE
hvcp in warp mode: models pre-warmed at GPU level (no cold start), queue depth visible per model, per-request Canon scoring (each request routed to the right model based on domain), and campfire:events streaming every GPU job. Scotty knows the engine status at all times. A Transformer Wall check runs before every inference batch.
🔱 IMAGE FACTORY — THE CURATION TAKES
CURRENT FACTORY
Build: msi01 docker build → tag pemos-portal-local:latest → retag as eosefleetacrdev.azurecr.io/pemos-portal:pathflow-vN → push to ACR (when auth works) → kubectl set image across 6 portals. The pathflow-vN number IS the curation — every build gets a number, every number is a decision.
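The build, retag, and push steps above can be written down as an ordered command list. This is a sketch under assumptions: `remote_image`, `build_commands`, and the build context `"."` are hypothetical names chosen for illustration. The image names and registry come from the text. The final kubectl set image step is left out because it needs deployment and container names that this section does not give.

```python
from typing import List

ACR = "eosefleetacrdev.azurecr.io"
LOCAL_IMAGE = "pemos-portal-local:latest"


def remote_image(n: int) -> str:
    # Full ACR reference for build number n, e.g. .../pemos-portal:pathflow-v405
    return f"{ACR}/pemos-portal:pathflow-v{n}"


def build_commands(n: int, context: str = ".") -> List[List[str]]:
    # Ordered argv lists: build locally, retag for ACR, then push.
    target = remote_image(n)
    return [
        ["docker", "build", "-t", LOCAL_IMAGE, context],
        ["docker", "tag", LOCAL_IMAGE, target],
        ["docker", "push", target],
    ]
```

Each entry can be passed to `subprocess.run(cmd, check=True)`, so a failed push stops the sequence instead of being silently skipped.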
ACR AUTH SITUATION
ACR push fails with "authentication required" — but kubectl set image works independently because AKS has its own managed identity pull access to the ACR. So: the image gets to prod via kubectl even when the push fails. This is a gap — the ACR tag doesn't match what's actually running. IRF-open: fix ACR push via az acr login or managed identity on the build host.
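One way to make this gap visible is to compare the tag each deployment is running against the tags that actually exist in ACR. The sketch below is illustrative only: `tag_of` and `find_drift` are hypothetical helpers, and the inputs are assumed to be collected separately (for example from `kubectl get deployment` output and an ACR tag listing). Images pinned by digest are not handled.

```python
from typing import Dict, Iterable, List


def tag_of(image: str) -> str:
    # "registry/repo:tag" -> "tag"; an image with no tag defaults to "latest".
    last = image.rsplit("/", 1)[-1]  # drop the registry host, which may contain ":"
    return last.split(":", 1)[1] if ":" in last else "latest"


def find_drift(running: Dict[str, str], acr_tags: Iterable[str]) -> List[str]:
    # Return deployments whose running tag is absent from the registry.
    known = set(acr_tags)
    return sorted(dep for dep, image in running.items() if tag_of(image) not in known)
```

An empty result means every running tag has a matching ACR tag; any name returned marks a deployment where the audit trail and the cluster disagree.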
CURATION PHILOSOPHY
pathflow-vN is the curation. Every N is an intentional step. We don't patch, we increment. The ACR has tags back to v389 — that's a full audit trail. The static/ source-of-truth rule means every build is reproducible from the git commit. Current: v405. The git commit hash is the ground truth.
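The "we increment" rule can be illustrated by deriving the next tag from the existing ones. This is a minimal sketch: `next_tag` and `TAG_RE` are hypothetical names, and the choice to ignore anything that is not exactly `pathflow-v<number>` (including environment-suffixed tags) is an assumption.

```python
import re
from typing import Iterable

# Matches only plain tags such as "pathflow-v405" (assumed canonical form).
TAG_RE = re.compile(r"^pathflow-v(\d+)$")


def next_tag(existing: Iterable[str]) -> str:
    # Collect every N seen so far and return the tag for max(N) + 1.
    numbers = [int(m.group(1)) for t in existing if (m := TAG_RE.match(t))]
    if not numbers:
        raise ValueError("no pathflow-vN tags found")
    return f"pathflow-v{max(numbers) + 1}"
```

Given tags from v389 through v405, this yields `pathflow-v406`. An existing number is never reused, which is what keeps the tag list usable as an audit trail.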
γ MISSING ENV GATES
Currently there are no formal sandbox/qe/qa/stage namespaces: everything ships directly from the dev build to prod K8s via kubectl set image. The env gates below are the design for the proper pipeline. The QA and Stage namespaces still need to be created, and the QE gate needs MEFINE Canon scoring wired in. The current process works, but the safety net is the pathflow-vN number, not a real gate.
PIPELINE — SANDBOX → PROD
SANDBOX · LOCAL · msi01 (192.168.2.18) · local:latest · ● n/a · 🕵️ BENJI DUNN · personal dev only...
QE · LOCAL/K8S · msclo (192.168.2.19) · pathflow-vN-qe · ● self-signed · 🖖 SCOTTY · Canon score must pass g1 threshold...
DEV · AKS · aks-eose-aaas-dev · pathflow-vN-dev · ● letsencrypt-prod (True) · 💊 NEO · MAL cascade tier 1 healthy...
QA · AKS · aks-eose-aaas-dev · pathflow-vN-qa · ● letsencrypt-prod · 🕵️ ETHAN HUNT · SWE-Bench equivalent: fleet-level functional test suite...
STAGE · AKS · aks-eose-aaas-dev · pathflow-vN-stage · ● letsencrypt-prod · 🎰 BOND 007 · Production mirror. Live traffic shadow....
PROD · AKS/LIVE · aks-eose-aaas-dev + master1-system · pathflow-vN · ● letsencrypt-prod (all True) · 🖖 PICARD · Stage gate + no active IRFs blocking...
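The tag naming across the six stages follows a simple pattern that can be sketched as a lookup. This is an illustration under assumptions: `PIPELINE`, `tag_for`, and `next_env` are hypothetical names, and the lower-case environment keys are a convention chosen here. The tag shapes themselves are taken from the stages listed above.

```python
from typing import Optional

# Promotion order, as in SANDBOX → QE → DEV → QA → STAGE → PROD.
PIPELINE = ["sandbox", "qe", "dev", "qa", "stage", "prod"]


def tag_for(env: str, n: int) -> str:
    # Return the image tag used in a given environment for build number n.
    if env not in PIPELINE:
        raise ValueError(f"unknown environment: {env}")
    if env == "sandbox":
        return "local:latest"        # sandbox runs the untagged local build
    if env == "prod":
        return f"pathflow-v{n}"      # prod carries the bare number
    return f"pathflow-v{n}-{env}"    # every other stage appends its own suffix


def next_env(env: str) -> Optional[str]:
    # Return the stage an image is promoted to next, or None after prod.
    i = PIPELINE.index(env)
    return PIPELINE[i + 1] if i + 1 < len(PIPELINE) else None
```

For build 405, `tag_for("qe", 405)` gives `pathflow-v405-qe`, and only the prod tag drops the suffix.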
ENVIRONMENT DETAILS
SANDBOX · LOCAL
HOST: msi01 (192.168.2.18)
CLUSTER/NS: none (bare docker)
IMAGE TAG: local:latest
REGISTRY: none
CERT: n/a
PROMOTION GATE: manual: docker build + run
🕵️ CAP: BENJI DUNN
Raw. Fast. No K8s. Build here first. Break everything here. The Terminator T3 test environment — if it runs here, it might survive the others.
QE · LOCAL/K8S
HOST: msclo (192.168.2.19)
CLUSTER/NS: docker-compose or local K3s
IMAGE TAG: pathflow-vN-qe
REGISTRY: msi01acr.azurecr.io
CERT: self-signed
PROMOTION GATE: MEFINE Canon score ≥ 0.62 + no WLD events in sandbox run
🖖 CAP: SCOTTY
Quality Engineering. Scotty runs the engines here. Fast iteration on forge/msclo pair. The QE gate is Canon-scored — we are literally using the Canon to validate our own builds.
DEV · AKS
HOST: aks-eose-aaas-dev
CLUSTER/NS: dev-system + master-dev-system
IMAGE TAG: pathflow-vN-dev
REGISTRY: eosefleetacrdev.azurecr.io
CERT: letsencrypt-prod (True)
PROMOTION GATE: QE gate passed + peer review (RICK validates)
💊 CAP: NEO
master.dev.eose.ca. The paradigm-read environment. NEO cap — you read the active paradigm here, you don't trust it yet. MAL tier 1. MEFINE primary. MeekAlpha :9451.
QA · AKS
HOST: aks-eose-aaas-dev
CLUSTER/NS: qa-system (to create)
IMAGE TAG: pathflow-vN-qa
REGISTRY: eosefleetacrdev.azurecr.io
CERT: letsencrypt-prod
PROMOTION GATE: Dev gate passed + DESEOF Prize run validates (at least 1 domain)
🕵️ CAP: ETHAN HUNT
The impossible test. If it passes QA it genuinely works. Ethan Hunt cap — every mission here is designed to fail the image, not pass it. The mission completes anyway.
STAGE · AKS
HOST: aks-eose-aaas-dev
CLUSTER/NS: stage-system (to create)
IMAGE TAG: pathflow-vN-stage
REGISTRY: eosefleetacrdev.azurecr.io
CERT: letsencrypt-prod
PROMOTION GATE: QA gate passed + Bond sign-off (human review at /stage.eose.ca)
🎰 CAP: BOND 007
The dress rehearsal. Bond cap — elegant, precise, everything looks right. Shadow traffic from prod. No real users, but full production config. If Bond doesn't notice anything wrong, it ships.
PROD · AKS/LIVE
HOST: aks-eose-aaas-dev + master1-system
CLUSTER/NS: pemos-system + master-system + deseof-system
IMAGE TAG: pathflow-vN
REGISTRY: eosefleetacrdev.azurecr.io
CERT: letsencrypt-prod (all True)
PROMOTION GATE: pathflow-vN confirmed at stage → prod rollout across 6 portals simultaneously
🖖 CAP: PICARD
"Make it so." Picard cap. Once Picard says it, it ships. Current: pemos.ca, eose.ca, deseof.ca, deseof.com, bob, john, krs portals. pathflow-v405 is the current prod image.
IMAGE CURATION PRINCIPLES
⚓ pathflow-vN IS THE FLOOR
Every build is numbered. No patches, no hotfixes, only increments. The number is the curation decision. v405 is the current floor. v1 will always have been v1. The git commit is the truth behind the number.
⬡ static/ IS THE SOURCE OF TRUTH
All HTML/JS goes to static/ first. internal/server/static/ is the rsync destination. The docker build embeds static/ into the binary via go:embed. Never write only to internal/. One source, one truth.
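A pre-build check can catch the case where the two trees have drifted apart. The sketch below is one possible approach, not part of the existing build: `out_of_sync` is a hypothetical helper built on Python's standard `filecmp.dircmp`, and running it before docker build is an assumption.

```python
import filecmp
from typing import List


def out_of_sync(src: str, dst: str) -> List[str]:
    # Return relative paths that differ between src and dst or exist on one side only.
    problems: List[str] = []

    def walk(cmp: filecmp.dircmp, prefix: str) -> None:
        # Note: dircmp compares shallowly by default, so files whose type,
        # size, and mtime all match are treated as equal without being read.
        for name in cmp.left_only + cmp.right_only + cmp.diff_files + cmp.funny_files:
            problems.append(prefix + name)
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix + name + "/")

    walk(filecmp.dircmp(src, dst), "")
    return sorted(problems)
```

Calling `out_of_sync("static", "internal/server/static")` and refusing to build on a non-empty result would enforce the "never write only to internal/" rule mechanically.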
〰 THE ACR TAG IS THE AUDIT TRAIL
eosefleetacrdev.azurecr.io/pemos-portal:pathflow-v{N} — full history back to v389. Every tag is a snapshot. kubectl rollout undo goes back one version in 30 seconds. The fleet can always go back.
🌀 ZERO-DOWNTIME ROLLOUT
kubectl set image runs a rolling update. 1 old pod stays alive until the new pod passes readiness. 6 portal deployments update in parallel — pemos, bob, deseof-ca, deseof-com, john, krs all move together. One command, 6 ships.
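The "one command, 6 ships" rollout can be sketched as a list of kubectl invocations: issue every set image first, then wait on each rollout. This is an illustration under assumptions: `rollout_commands` is hypothetical, the container name `portal` is a guess, and the namespace for each deployment must be supplied by the caller because the text does not say which portal lives in which namespace.

```python
from typing import List, Tuple


def rollout_commands(
    image: str,
    targets: List[Tuple[str, str]],  # (namespace, deployment) pairs, caller-supplied
    container: str = "portal",       # assumption: container name inside each pod
    timeout: str = "120s",
) -> List[List[str]]:
    # kubectl set image returns as soon as the spec is patched, so issuing all
    # of them before any wait lets the rolling updates proceed in parallel.
    set_cmds = [
        ["kubectl", "set", "image", f"deployment/{dep}", f"{container}={image}", "-n", ns]
        for ns, dep in targets
    ]
    # kubectl rollout status blocks until the rollout finishes or the timeout expires.
    wait_cmds = [
        ["kubectl", "rollout", "status", f"deployment/{dep}", "-n", ns, f"--timeout={timeout}"]
        for ns, dep in targets
    ]
    return set_cmds + wait_cmds
```

If any `rollout status` command exits non-zero, that deployment did not reach readiness in time and is the candidate for `kubectl rollout undo`.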
γ CANON GATE (TARGET)
The target pipeline: MEFINE Canon scores the build output before promotion. Canon score = 0.35·γ₁ + 0.25·gate + 0.20·lsos + 0.10·wld + 0.10·fep (the weights sum to 1.00). Score ≥ 0.62 = promote. Below 0.62 = WLD event, fix and re-run. Currently: manual review only. Automation: next sprint.
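The scoring rule is a weighted sum with a fixed threshold, which can be sketched as follows. The weights and the 0.62 threshold come from the text. Everything else is an assumption: `canon_score` and `should_promote` are hypothetical names, the key `gamma1` stands in for γ₁, and each component is assumed to be already normalised to the range 0 to 1 (the text does not say how the raw values are scaled).

```python
from typing import Dict

# Component weights from the Canon gate description; they sum to 1.00.
WEIGHTS = {"gamma1": 0.35, "gate": 0.25, "lsos": 0.20, "wld": 0.10, "fep": 0.10}
THRESHOLD = 0.62


def canon_score(components: Dict[str, float]) -> float:
    # Weighted sum of the five components (each assumed to be in [0, 1]).
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def should_promote(components: Dict[str, float]) -> bool:
    # Promote when the score reaches the threshold; otherwise it is a WLD event.
    return canon_score(components) >= THRESHOLD
```

As a worked case with made-up inputs: a build scoring 0 on `gamma1` and 1 on everything else reaches 0.65 and would promote, while also scoring 0 on `wld` drops it to 0.55 and blocks promotion.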
🌌 MEGSCIFIAR WITNESSES
The hvcp-coordinator CronJob in the megscifiar namespace checks every 15 min. R2-D2 (DESEOF Witness) records every deploy. DESEOF witnesses all build events, not just the successes. The failures are part of the log. FOF: even the broken builds teach something.