absorb.md

Semantic Tube Prediction: Beating LLM Data Efficiency with JEPA

1 mention by 1 person

Yann LeCun
paper · 2026-02-26
Recommended

To that end, we introduce the Geodesic Hypothesis, positing that token sequences trace geodesics on a smooth semantic manifold and are therefore locally linear. Building on this principle, we propose a novel Semantic Tube Prediction (STP) task, a JEPA-style regularizer that confines hidden-state trajectories to a tubular neighborhood of the geodesic. STP generalizes JEPA to language without requiring explicit multi-view augmentations.
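To make the geometric picture concrete, here is a minimal sketch of what a "tube" penalty could look like. It is not the paper's loss, which the excerpt above does not specify. It assumes the local geodesic is approximated by the straight chord between the first and last hidden state of a short window, and that the tube is enforced with a squared hinge on perpendicular distance. The names `perpendicular_distance`, `tube_penalty`, and `radius` are hypothetical, chosen only for illustration.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from `point` to the straight line through `start` and `end`."""
    direction = [e - s for s, e in zip(start, end)]
    offset = [p - s for s, p in zip(start, point)]
    norm_sq = sum(d * d for d in direction)
    if norm_sq == 0.0:
        # Degenerate chord: fall back to the distance from the start point.
        return math.sqrt(sum(o * o for o in offset))
    # Coefficient of the projection of `offset` onto the chord direction.
    t = sum(o * d for o, d in zip(offset, direction)) / norm_sq
    residual = [o - t * d for o, d in zip(offset, direction)]
    return math.sqrt(sum(r * r for r in residual))


def tube_penalty(hidden_states, radius=0.0):
    """Mean squared excess distance of interior states from the chord.

    Assumption (not from the source): the geodesic is approximated by the
    chord joining the first and last state, and states inside a tube of the
    given `radius` around that chord incur no penalty.
    """
    if len(hidden_states) < 3:
        return 0.0  # no interior states to constrain
    start, end = hidden_states[0], hidden_states[-1]
    excess = [
        max(0.0, perpendicular_distance(h, start, end) - radius)
        for h in hidden_states[1:-1]
    ]
    return sum(e * e for e in excess) / len(excess)


# Toy 2-D "hidden states": one collinear trajectory, one that bends away.
straight = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
bent = [[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]]
print(tube_penalty(straight))          # 0.0
print(tube_penalty(bent))              # 4.0
print(tube_penalty(bent, radius=0.5))  # 2.25
```

In this toy form, a locally linear trajectory costs nothing, and the penalty grows only once a hidden state leaves the tube. In an actual training setup such a term would presumably be added to the usual next-token loss with a weighting coefficient, which is likewise an assumption here.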

Geometric Priors Enable Data-Efficient LLM Training