Semantic Tube Prediction Beating Llm Data Efficiency With Jepa
1 mentions across 1 person
Visit ↗All mentions
“To that end, we introduce the Geodesic Hypothesis, positing that token sequences trace geodesics on a smooth semantic manifold and are therefore locally linear. Building on this principle, we propose a novel Semantic Tube Prediction (STP) task, a JEPA-style regularizer that confines hidden-state trajectories to a tubular neighborhood of the geodesic. STP generalizes JEPA to language without requiring explicit multi-view augmentations.”
Geometric Priors Enable Data-Efficient LLM Training ↗