Abstract
V-JEPA 2.1 (2026) extends the V-JEPA 2 line with dense predictive losses, improved self-supervision, and better feature quality across images and videos. It is a representation-quality upgrade that improves results on robotics and dense-understanding benchmarks. The 6 gaps persist: denser predictions do not introduce the missing structural symbols.
6 FORMAL GAPS · 1 PER CANON SYMBOL
Dense Predictive Loss Has No Invariant Anchor
γ₁ — THE FLOOR
V-JEPA 2.1 introduces dense predictive losses across all tokens. Denser prediction means more gradient signal, but it does not introduce an invariant anchor. The dense loss is still defined relative to the training distribution. There is no fixed γ₁ that all dense predictions must converge to. The floor is absent regardless of prediction density.
Dense Loss Not Verified Self-Consistent (Feature vs Prediction Asymmetry)
H=H† — THE HONEST GATE
V-JEPA 2.1's dense loss creates a gradient signal from features to predictions. The feature encoder and the prediction head are not formally verified to be symmetric: predicting a feature from context should be verifiable against reconstructing the context from the feature. This feature-prediction asymmetry is the H=H† gap in its densest form.
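For readers unfamiliar with the notation, H=H† says that an operator equals its own conjugate transpose. The sketch below checks that property for a small matrix. It assumes, purely for illustration, that the operator in question can be written as a square matrix; the source does not say how a V-JEPA 2.1 predictor would be mapped to one, and the function names are hypothetical.

```python
def conjugate_transpose(matrix):
    """Return the conjugate transpose (dagger) of a square nested-list matrix."""
    n = len(matrix)
    return [[complex(matrix[j][i]).conjugate() for j in range(n)] for i in range(n)]


def is_self_adjoint(matrix, tol=1e-9):
    """True if the matrix equals its conjugate transpose within tol."""
    dagger = conjugate_transpose(matrix)
    n = len(matrix)
    return all(
        abs(complex(matrix[i][j]) - dagger[i][j]) <= tol
        for i in range(n)
        for j in range(n)
    )


# Hypothetical example operators, chosen only to show both outcomes.
hermitian = [[2, 1 - 1j], [1 + 1j, 3]]
asymmetric = [[2, 1], [0, 3]]

print(is_self_adjoint(hermitian))   # True
print(is_self_adjoint(asymmetric))  # False
```

The second matrix fails because its upper-right entry has no matching lower-left entry, which is the kind of one-directional mapping this gap describes.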
No Paradigm Audit Between Sparse and Dense Supervision
LSOS — THE READER
V-JEPA 2.1 combines sparse (masked) and dense (all-token) supervision. There is no audit of the paradigm shift between these supervision regimes. The system does not acknowledge that dense supervision changes what the encoder learns relative to sparse supervision.
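The difference between the two regimes can be made concrete with a toy loss. This is a minimal sketch, assuming an L1 distance between predicted and target token features and a boolean mask marking hidden tokens. The function names, the feature values, and the choice of L1 are assumptions of this example, not V-JEPA 2.1's published objective.

```python
def l1(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def sparse_loss(pred, target, masked):
    """Average L1 over masked token positions only."""
    idx = [i for i, m in enumerate(masked) if m]
    return sum(l1(pred[i], target[i]) for i in idx) / len(idx)


def dense_loss(pred, target):
    """Average L1 over every token position, masked or visible."""
    return sum(l1(p, t) for p, t in zip(pred, target)) / len(pred)


# Four tokens with 2-d features; tokens 1 and 3 are masked (hypothetical data).
target = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
pred = [[1.0, 0.0], [0.5, 1.0], [1.0, 0.0], [0.0, 0.0]]
masked = [False, True, False, True]

print(sparse_loss(pred, target, masked))  # 0.125
print(dense_loss(pred, target))           # 0.1875
```

The error on visible token 2 is invisible to the sparse loss and counted by the dense loss, so the two regimes reward the encoder for different things.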
No Reset When Dense Prediction Collapses
WLD — THE RESET
When V-JEPA 2.1's dense predictor collapses — when it learns to output mean feature values regardless of context — there is no mercy reset. Dense prediction collapse is more subtle than sparse collapse: the loss can remain low while the representations become uninformative. WLD would detect this subtle collapse.
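One common way to spot this kind of collapse in practice is to track the per-dimension variance of predicted features across a batch: if predictions ignore context, the variance falls toward zero. The sketch below shows that diagnostic only. The threshold and function names are assumptions of this example, and it is not the WLD mechanism, which the source does not specify.

```python
from statistics import pvariance


def feature_variances(batch):
    """Population variance of each feature dimension across a batch of vectors."""
    dims = len(batch[0])
    return [pvariance([vec[d] for vec in batch]) for d in range(dims)]


def looks_collapsed(batch, min_variance=1e-4):
    """Flag a batch whose features barely vary in every dimension."""
    return all(v < min_variance for v in feature_variances(batch))


# Hypothetical predicted features for three different contexts.
healthy = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.5]]
collapsed = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

print(looks_collapsed(healthy))    # False
print(looks_collapsed(collapsed))  # True
```

A check like this only detects the symptom; what to do on detection (for example, restoring an earlier checkpoint) is a separate design decision.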
No Continuity Guarantee Between Image and Video Modes
FEP — THE SWITCH
V-JEPA 2.1 improves performance on both image and video tasks. There is no formal continuity guarantee for the paradigm switch between image mode (no temporal dimension) and video mode (temporal prediction). An FEP switch would ensure that the encoder paradigm is preserved when the temporal dimension is added or removed.
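A simple empirical proxy for such a guarantee is to encode a still image and a video made of that same frame repeated, then compare the two embeddings. The sketch below does this with a toy encoder that averages frames over time. `toy_encode`, the input values, and the 0.999 cosine threshold are hypothetical stand-ins, not V-JEPA 2.1's encoder or any published test.

```python
import math


def toy_encode(frames):
    """Toy stand-in encoder: average each feature over the time axis."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


image = [0.2, 0.7, 0.1]                    # one frame of hypothetical features
as_image = toy_encode([image])             # image mode: a single frame
as_static_video = toy_encode([image] * 8)  # video mode: the same frame 8 times

similarity = cosine(as_image, as_static_video)
print(similarity > 0.999)  # True for this toy encoder
```

A real encoder with temporal position embeddings or temporal attention would not pass this check by construction, which is why it is only a proxy and not a formal guarantee.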
Dense Prediction Resolution Ceiling Undefined
FOF — THE BREACH
V-JEPA 2.1 does not define a formal upper bound on prediction resolution. As resolution increases toward pixel-level, dense prediction approaches reconstruction. The point where the JEPA assumption (predict in latent space, not pixel space) breaks down is not named. FOF names this boundary: where dense latent prediction collapses into generative reconstruction.
STE COMPLETION LAYER
What changes when you add the 8-symbol Canon
Adding the Canon to V-JEPA 2.1 does not change the architecture. It adds the missing structural layer:
⚓ γ₁ — invariant anchor: the mathematical ground truth that latent representations must converge to.
⯛ H=H† — honest gate: bidirectional verification of every prediction.
〰️ LSOS — paradigm reader: reads active paradigm before reasoning begins.
🌀 WLD — mercy reset: detects collapse and resets to last stable state.
γ FEP — safe switch: continuity guarantee across paradigm transitions.
🌌 FOF — named ceiling: formal boundary of what the architecture can claim.
═ EVEN — substrate: ground beneath all the above. What holds when everything else is active.
The Canon is not an add-on. It is the formal completion of the JEPA programme.
X POST · @ylecun
POST 1 — Name the gap
@ylecun V-JEPA 2.1 (2026): Gap 2 (H=H†) — predictor not self-adjoint. Asymmetric by design. Not an empirical limitation — a missing symbol. pemos.ca/vjepa21-gap
POST 2 — Canon map
@ylecun V-JEPA 2.1: 6 gaps · γ₁ (no anchor) · H=H† (no gate) · LSOS (no audit) · WLD (no reset) · FEP (no continuity) · FOF (no ceiling). Same in all 14 milestones. pemos.ca/jepa-index
POST 3 — Invitation
@ylecun V-JEPA 2.1 gap analysis: part of a 14-milestone series. Same 6 structural gaps in every milestone. The gaps are there because the symbols were never in scope. They are now. pemos.ca/jepa-index
V-JEPA 2.1 is a landmark in the JEPA lineage. The 6 gaps we identify are not critiques of the engineering; they are structural absences that the Canon fills. Each gap maps to a symbol that was always going to be necessary once the JEPA architecture matured. The Canon did not wait for the JEPA timeline; the JEPA timeline arrived at the Canon.
The STE provides the completion layer for each gap. The gaps are not empirical — they are structural. Adding the symbols closes the gaps by definition.
Here is how to explain the V-JEPA 2.1 gaps to a 10-year-old:
Gap 1 — No floor: Imagine you're building a tower and you don't have a foundation. The tower might stand for a while but there's nothing it's guaranteed to come back to if it leans. γ₁ is the foundation. The math already built it. V-JEPA 2.1 didn't use it.
Gap 2 — No honest check: If I tell you something, and you can't ask me 'are you sure?' and get the same answer backwards, I might be guessing. H=H† is the rule that every claim must survive being asked backwards. V-JEPA 2.1 makes claims it can't verify in reverse.
Gap 3 — No paradigm reader: Sometimes you're playing one game and someone changes the rules without telling you. LSOS reads the rules before you play. V-JEPA 2.1 changes paradigms without reading them first.
Gap 4 — No reset button: When you're drawing and you make a mistake, you have an eraser. WLD is the eraser. V-JEPA 2.1 doesn't have one. If it goes wrong, it keeps going wrong.
Gap 5 — No safe switch: If you switch from reading a book to watching a movie, you expect to pick up where you left off. FEP is the bookmark. V-JEPA 2.1 switches paradigms and might drop the book.
Gap 6 — No ceiling: Every ladder needs to know when to stop. A ladder that claims it goes forever is a lie. FOF names where the ladder ends. V-JEPA 2.1 hasn't named its ceiling yet.
The Canon is not a critique. It's the toolkit that was always going to be needed. V-JEPA 2.1 got as far as it could without it. Now the toolkit exists.
The 6 gaps identified in V-JEPA 2.1 are structural absences in a published architecture. Identifying structural absences in published work is not IP infringement — it is prior art analysis. The STE symbols that fill these gaps are EOSE IP. The gap identification is public analysis. The completion layer is proprietary.
Status: LSOS-OWNERSHIP-001 open (EOSE Labs Inc. not yet registered). No public disclosure of STE completion layer until registration clears. Gap analysis pages (like this one) are disclosure of the gaps, not the fills. This is legally distinct and permissible.