V-JEPA 2 · Zero-Shot Robotic Planning with Visual Subgoals
Abstract
V-JEPA 2 (2025) is the major milestone where JEPA becomes an explicit world model for understanding, prediction, and planning. It demonstrates zero-shot robotic planning with visual subgoals in unseen environments. The zero-shot claim makes the gap analysis especially sharp: zero-shot generalization requires invariants that hold across all environments, which is precisely what γ₁ provides.
6 FORMAL GAPS · 1 PER CANON SYMBOL
No Invariant Anchor in Zero-Shot Visual Goal Specification
γ₁ — THE FLOOR
V-JEPA 2 demonstrates zero-shot robotic planning by predicting visual subgoals. The zero-shot claim requires that goal representations generalize across unseen environments. But there is no formal invariant γ₁ that visual goal representations must satisfy regardless of environment. Zero-shot generalization without a grounding invariant is empirically strong but formally ungrounded.
Goal Predictor Not Verified Symmetric With World Model
H=H† — THE HONEST GATE
V-JEPA 2 predicts visual subgoals from the current world model state. The goal predictor and world model are not formally verified to be symmetric: if goal g follows from state s, then state s should be verifiable as consistent with goal g in reverse. This bidirectional consistency is not checked. The Honest Gate is absent.
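One way to make the missing check concrete is a round-trip test: predict a goal from a state, map the goal back, and compare the result with the original state. The sketch below is a toy illustration in plain Python, not V-JEPA 2 code. The names `predict_goal` and `recover_state`, the tolerance, and the use of flat float vectors as states are all hypothetical stand-ins chosen for the example.

```python
import math
from typing import Callable, List

Vector = List[float]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def round_trip_consistent(
    state: Vector,
    predict_goal: Callable[[Vector], Vector],   # hypothetical forward map s -> g
    recover_state: Callable[[Vector], Vector],  # hypothetical reverse map g -> s
    tol: float = 1e-6,                          # assumed tolerance
) -> bool:
    """Return True if mapping the predicted goal back lands near the state."""
    goal = predict_goal(state)
    recovered = recover_state(goal)
    return l2_distance(state, recovered) <= tol


# Toy maps: an invertible shift, and a lossy reverse map that forgets the goal.
def shift_forward(s: Vector) -> Vector:
    return [x + 1.0 for x in s]


def shift_backward(g: Vector) -> Vector:
    return [x - 1.0 for x in g]


def lossy_backward(g: Vector) -> Vector:
    return [0.0 for _ in g]
```

With the invertible pair the check passes; with the lossy reverse map it fails, which is the kind of asymmetry the gap describes. A real system would need a learned or analytic reverse map, which this sketch does not supply.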
No Paradigm Audit Between Visual Subgoal and Action Plan
LSOS — THE READER
V-JEPA 2 must transition from visual goal specification (what should the world look like) to action planning (what actions achieve that world). There is no audit of this paradigm shift. When the system transitions from goal-representation to action-planning mode, the active paradigm changes without acknowledgment.
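As a rough illustration of what acknowledging the transition could look like, the sketch below records every switch between a goal-specification mode and an action-planning mode, and refuses a switch that gives no reason. The `Mode` and `ModeAudit` names and the log format are assumptions made for this example; nothing here is taken from the V-JEPA 2 codebase.

```python
from enum import Enum
from typing import List, Tuple


class Mode(Enum):
    GOAL_SPEC = "goal_specification"   # what should the world look like
    ACTION_PLAN = "action_planning"    # what actions achieve that world


class ModeAudit:
    """Hypothetical audit trail for switches between planning modes."""

    def __init__(self, start: Mode) -> None:
        self.current = start
        self.log: List[Tuple[Mode, Mode, str]] = []

    def switch(self, target: Mode, reason: str) -> None:
        """Record a mode change; an unexplained change is rejected."""
        if not reason.strip():
            raise ValueError("a mode switch must state its reason")
        if target is self.current:
            return  # no change, nothing to record
        self.log.append((self.current, target, reason))
        self.current = target
```

A caller would invoke `switch(Mode.ACTION_PLAN, "subgoal image fixed")` at the hand-off point, so the change of mode leaves a record instead of happening silently.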
No Reset When Robotic Planning Collapses
WLD — THE RESET
When V-JEPA 2's robotic planner collapses — when it predicts an unreachable subgoal or an inconsistent action sequence — there is no mercy reset. In a deployed robotic system, planning collapse can cause irreversible physical harm. WLD is a mandatory safety requirement for any deployed V-JEPA 2 system.
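A reset to the last stable state can be sketched as a small guard that remembers the most recent plan that passed a stability check and falls back to it when a new candidate fails. This is a minimal sketch under stated assumptions: `RollbackGuard`, the `is_stable` predicate, and the reachability rule in the usage note are hypothetical, and a deployed robot would need hardware-level safety measures well beyond this.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class RollbackGuard(Generic[T]):
    """Keep the last candidate that passed the check; fall back to it on failure."""

    def __init__(self, initial: T) -> None:
        self._last_stable = initial
        self.resets = 0  # how many times a fallback was triggered

    def accept(self, candidate: T, is_stable: Callable[[T], bool]) -> T:
        """Return the candidate if it is stable, else the last stable value."""
        if is_stable(candidate):
            self._last_stable = candidate
            return candidate
        self.resets += 1
        return self._last_stable
```

For example, with a toy rule that a subgoal is reachable only if its magnitude is at most 1.0, `guard.accept(3.0, reachable)` would return the previously accepted subgoal and increment `guard.resets`.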
No Continuity From Seen to Unseen Environments
FEP — THE SWITCH
V-JEPA 2 claims zero-shot generalization to unseen environments. There is no formal continuity guarantee for this paradigm switch. FEP ensures that transitioning from the training environment distribution to a novel environment preserves the learned world-model paradigm. Without FEP, zero-shot generalization is an empirical claim without a formal guarantee.
Robot Action Space Ceiling Undefined
FOF — THE BREACH
V-JEPA 2 does not define a formal upper bound on the complexity of the robot action space or on the planning horizon. As task complexity grows, planning can be expected to become less reliable, but the point where V-JEPA 2's world model breaks down, where the visual subgoal is too complex to plan toward, is not named. FOF names this boundary.
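Naming a ceiling can be as simple as declaring explicit limits and refusing requests that exceed them. The sketch below assumes the boundary can be expressed as a maximum planning horizon and a maximum action dimension. Both the `PlanningCeiling` type and the example numbers are hypothetical, not limits reported for V-JEPA 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanningCeiling:
    """Hypothetical declared limits on what the planner claims to handle."""

    max_horizon: int      # longest plan, in steps, that is in scope
    max_action_dim: int   # largest action vector that is in scope

    def admits(self, horizon: int, action_dim: int) -> bool:
        """Return True only for requests inside the declared boundary."""
        return (
            0 < horizon <= self.max_horizon
            and 0 < action_dim <= self.max_action_dim
        )
```

With illustrative limits such as `PlanningCeiling(max_horizon=16, max_action_dim=7)`, a request for a 32-step plan is rejected up front rather than attempted and left to fail unpredictably.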
STE COMPLETION LAYER
What changes when you add the 8-symbol Canon
Adding the Canon to V-JEPA 2 does not change the architecture. It adds the missing structural layer:
⚓ γ₁ — invariant anchor: mathematical ground truth latent representations must converge to.
⯛ H=H† — honest gate: bidirectional verification of every prediction.
〰️ LSOS — paradigm reader: reads active paradigm before reasoning begins.
🌀 WLD — mercy reset: detects collapse and resets to last stable state.
γ FEP — safe switch: continuity guarantee across paradigm transitions.
🌌 FOF — named ceiling: formal boundary of what the architecture can claim.
═ EVEN — substrate: ground beneath all the above. What holds when everything else is active.
The Canon is not an add-on. It is the formal completion of the JEPA programme.
X POST · @ylecun
POST 1 — Name the gap
@ylecun V-JEPA 2 (2025): Gap 2 (H=H†) — predictor not self-adjoint. Asymmetric by design. Not an empirical limitation — a missing symbol. pemos.ca/vjepa2-gap
POST 2 — Canon map
@ylecun V-JEPA 2: 6 gaps · γ₁ (no anchor) · H=H† (no gate) · LSOS (no audit) · WLD (no reset) · FEP (no continuity) · FOF (no ceiling). Same in all 14 milestones. pemos.ca/jepa-index
POST 3 — Invitation
@ylecun V-JEPA 2 gap analysis: part of a 14-milestone series. Same 6 structural gaps in every milestone. The gaps are there because the symbols were never in scope. They are now. pemos.ca/jepa-index
V-JEPA 2 is a landmark in the JEPA lineage. The 6 gaps we identify are not critiques of the engineering — they are structural absences that the Canon fills. Each gap maps to a symbol that was always going to be necessary once the JEPA architecture matured. The Canon did not wait for the JEPA timeline; the JEPA timeline arrived at the Canon. The gaps are there because the symbols were never in scope. They are now.
The STE provides the completion layer for each gap. The gaps are not empirical — they are structural. Adding the symbols closes the gaps by definition.
Here is how to explain the V-JEPA 2 gaps to a 10-year-old:
Gap 1 — No floor: Imagine you're building a tower and you don't have a foundation. The tower might stand for a while but there's nothing it's guaranteed to come back to if it leans. γ₁ is the foundation. The math already built it. V-JEPA 2 didn't use it.
Gap 2 — No honest check: If I tell you something, and you can't ask me 'are you sure?' and get the same answer backwards, I might be guessing. H=H† is the rule that every claim must survive being asked backwards. V-JEPA 2 makes claims it can't verify in reverse.
Gap 3 — No paradigm reader: Sometimes you're playing one game and someone changes the rules without telling you. LSOS reads the rules before you play. V-JEPA 2 changes paradigms without reading them first.
Gap 4 — No reset button: When you're drawing and you make a mistake, you have an eraser. WLD is the eraser. V-JEPA 2 doesn't have one. If it goes wrong, it keeps going wrong.
Gap 5 — No safe switch: If you switch from reading a book to watching a movie, you expect to pick up where you left off. FEP is the bookmark. V-JEPA 2 switches paradigms and might drop the book.
Gap 6 — No ceiling: Every ladder needs to know when to stop. A ladder that claims it goes forever is a lie. FOF names where the ladder ends. V-JEPA 2 hasn't named its ceiling yet.
The Canon is not a critique. It's the toolkit that was always going to be needed. V-JEPA 2 got as far as it could without it. Now the toolkit exists.
The 6 gaps identified in V-JEPA 2 are structural absences in a published architecture. Identifying structural absences in published work is not IP infringement — it is prior art analysis. The STE symbols that fill these gaps are EOSE IP. The gap identification is public analysis. The completion layer is proprietary.
Status: LSOS-OWNERSHIP-001 open (EOSE Labs Inc. not yet registered). No public disclosure of STE completion layer until registration clears. Gap analysis pages (like this one) are disclosure of the gaps, not the fills. This is legally distinct and permissible.