
Yann LeCun

Chronological feed of everything captured from Yann LeCun.

Yann LeCun's X Feed: Acknowledged Utility, Yet Immature

Yann LeCun's X feed, while already useful, is assessed as being at an early stage of development: it delivers real value today, but significant improvement and maturation are expected before it reaches its potential.

Decentralized Control of Superintelligence

Yann LeCun posits that the control of superintelligence will not be centralized under a single individual. This suggests a future where superintelligent AI systems are managed through distributed or collective mechanisms, moving away from a singular authority model. This perspective contrasts with fears of a single entity wielding absolute control over advanced AI.

Over-parameterization, Open-source, and the Future of AI

Deep learning models challenge traditional statistical wisdom regarding over-parameterization, demonstrating that increased model complexity beyond the "double descent" point can lead to improved generalization. The discussion emphasizes the critical role of high-quality data curation and human-in-the-loop feedback in developing robust AI, a role often under-communicated by leading AI companies. The future of AI is envisioned through open-source foundational models that foster diverse applications; the discussion argues against restrictive regulations, stressing accessibility and decentralized control as essential for societal benefit and for preventing the monopolization of AI capabilities.

Humorous Self-Deprecation by Yann LeCun

This content is a humorous, self-deprecating post by Yann LeCun, stating 'I'm low IQ.' It's likely an ironic comment, given his prominent status in the AI field. This post provides a lighthearted example of social media interaction from a leading researcher.

Trained AI Models Restricted to Explicitly Taught Questions

According to Yann LeCun, AI models are currently limited to answering questions for which they have received explicit training. This implies a scope constraint based on their training data and methodology, and highlights a fundamental limitation of current AI with respect to generalized knowledge and unforeseen queries.

Language is Neither Necessary Nor Sufficient for Advanced Cognition

Yann LeCun argues that language, while helpful, is not the fundamental basis of thought. He uses an analogy of a roof, which is useful but requires foundations and walls (representing non-linguistic thought structures) to be truly effective. This implies that core cognitive processes operate independently of linguistic expression.

Federally Funded Research Demonstrates Substantial ROI

Yann LeCun refutes claims against federally funded research, asserting its high long-term return on investment. Data suggests a 150-300% ROI, making it a highly effective expenditure. This directly counters arguments that disparage such investments, highlighting their economic benefits.

Ad Hominem Attack on X Platform

The provided content is an ad hominem attack from an X (formerly Twitter) feed. It lacks substantive information, rendering it unusable for knowledge extraction or factual analysis. The sole recoverable information is the nature of the interaction as a direct personal insult.

Language, Thought, and World Models: A Causal Relationship

Yann LeCun posits a hierarchical relationship between language, thought, and mental models. Language serves as a communication medium for thoughts, which are theorized as manipulations of internal world models. This suggests a foundational role for robust world models in the generation of meaningful thought and, subsequently, coherent linguistic expression.

JEPA Architecture for AI Advancement

Language Models’ Limitations in General Reasoning

Yann LeCun posits that thinking primarily involves manipulating mental models in an abstract, continuous representation space, rather than relying on language. This suggests that while language models may benefit specific applications like coding and mathematics where language aids reasoning, their utility for general, abstract reasoning is inherently limited by their linguistic nature.

Criticism of Unspecified "BS" in AI/Tech Discourse

Yann LeCun, a prominent figure in AI, expresses strong disapproval of unspecified "BS" on his X (formerly Twitter) feed. This brief but forceful statement suggests a perceived prevalence of misinformation or low-quality content within the AI/tech discourse, though the specific targets or nature of this "BS" are not detailed. The core insight is the existence of significant, unclarified contention from an influential voice.

Humor Detection in AI Models: A Case Study of Yann LeCun's X Feed

This analysis investigates the ability of AI models to interpret and contextualize humor, specifically focusing on the use of "😂😂😂" in social media. The core insight revolves around the limitations of current AI in discerning nuanced human communication and emotional expression. The brevity of the content makes it a challenging case for robust knowledge extraction.

Proposed US Federal Budget Cuts Threaten Systemic Collapse of Scientific Research

Proposed US federal budget cuts under the Trump administration target critical agencies including NASA, NIH, and the NSF. Specifically, the total removal of the NSF's social, economic, and behavioral sciences directorate threatens to dismantle key pillars of the US scientific infrastructure. These measures are viewed by the scientific community as a systemic risk to global research leadership.

Hierarchical Planning Enhances Long-Horizon Control in Latent World Models

Model Predictive Control (MPC) with learned world models struggles with long-horizon tasks due to error accumulation and large search spaces. This work proposes hierarchical planning using latent world models at multiple temporal scales. This approach reduces inference-time complexity and enables long-horizon reasoning, improving zero-shot control capabilities.
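
As a toy illustration of this two-level idea (a sketch, not the paper's method), the snippet below plans in a 1-D state space: a coarse level proposes evenly spaced subgoals, and a fine level runs short-horizon greedy control toward each one, so errors reset at every subgoal rather than compounding over the full horizon. All function names and the `s' = s + a` dynamics are illustrative assumptions.

```python
import numpy as np

def coarse_plan(start, goal, num_subgoals):
    """Coarse level: propose evenly spaced subgoals (one per macro-step)."""
    return np.linspace(start, goal, num_subgoals + 1)[1:]

def fine_plan(state, subgoal, step_model, horizon, max_action=1.0):
    """Fine level: greedy short-horizon control toward the current subgoal."""
    actions = []
    for _ in range(horizon):
        a = np.clip(subgoal - state, -max_action, max_action)
        state = step_model(state, a)
        actions.append(a)
    return state, actions

def hierarchical_plan(start, goal, step_model, num_subgoals=4, horizon=3):
    """Chain fine plans through the coarse subgoals; errors reset at each subgoal."""
    state, all_actions = start, []
    for sg in coarse_plan(start, goal, num_subgoals):
        state, acts = fine_plan(state, sg, step_model, horizon)
        all_actions += acts
    return state, all_actions

# toy dynamics: s' = s + a
step = lambda s, a: s + a
final, acts = hierarchical_plan(0.0, 10.0, step)
```

Replacing the greedy fine-level controller with a short-horizon MPC over a learned latent model recovers the structure described in the summary.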

Critique of Closed AI Models and Open Source Contribution Imbalance

Yann LeCun asserts that closed AI models unfairly profit from advancements made by open-source models without reciprocating contributions. This creates an imbalance where commercial entities leverage community efforts without giving back to the open AI ecosystem, which could stifle collaborative progress and innovation.

GOP Introduces “America First Award” for Donald Trump, Signaling Deepening Cult of Personality

The Republican Party has created a new "America First Award" and presented it to Donald Trump. This move, celebrated by Speaker Mike Johnson, suggests a solidification of Trump's influence within the party and reinforces the perception of a cult of personality. The award’s presentation, described with opulent language, indicates a strategic effort to further elevate Trump.

LeCun Irony on National Debt

Yann LeCun, a prominent AI researcher, commented "Tired of winning", apparently ironically (an interpretation; his intent could be multifaceted), on a post linking to an article about the US national debt reaching $39 trillion. This suggests a subtle critique of the economic implications of current policies, potentially hinting at a broader concern about unsustainable fiscal trends despite a superficial appearance of success. The "WELP" from the Tennessee Holler adds to the sardonic tone, implying a resigned acknowledgement of the situation.

Cognitive-Inspired Autonomous Learning Architectures for AI

Current AI models are limited in autonomous learning. This paper proposes a new architecture inspired by human and animal cognition, integrating observation-based learning (System A) and active behavior-based learning (System B), controlled by internal meta-control signals (System M). The framework aims to enable AI to adapt to dynamic, real-world environments across evolutionary and developmental timescales.

V-JEPA 2.1: Advancing Dense Vision and World Modeling through Self-Supervised Learning

V-JEPA 2.1 is a self-supervised model that achieves state-of-the-art performance in dense visual understanding and world modeling for both images and videos. This is accomplished by integrating a dense predictive loss, deep self-supervision across encoder layers, multi-modal tokenizers, and effective scaling of model capacity and training data. The resulting representations are spatially structured, semantically coherent, and temporally consistent, demonstrating significant improvements across various benchmarks.

Humorous Take on Scientist Compensation vs. Athlete Salaries

Yann LeCun humorously suggests that scientists earning more than professional athletes would be a positive development. This indicates a personal sentiment rather than a factual claim about current compensation or a policy proposal. The statement serves as an expression of an aspirational ideal for the recognition and reward of scientific contributions.

Stabilizing Joint-Embedding Predictive Architectures via Gaussian Latent Regularization

LeWorldModel (LeWM) introduces a streamlined Joint-Embedding Predictive Architecture (JEPA) that achieves stable end-to-end training from pixels by utilizing a simplified two-term loss function. By replacing complex stabilization methods with a Gaussian latent regularizer, it significantly reduces hyperparameter overhead and enables high-speed planning (up to 48x faster than foundation models) while maintaining physical grounding in its latent representations.
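
The "two-term loss" described above can be sketched abstractly as a latent prediction error plus a Gaussian regularizer matching the first two moments of the batch latents to N(0, I). This is a hedged reconstruction from the summary, not LeWM's published objective; the function names and the moment-matching form are assumptions.

```python
import numpy as np

def gaussian_latent_reg(z):
    """Push a batch of latents toward N(0, I): penalize nonzero per-dimension
    mean and non-unit per-dimension variance."""
    mean = z.mean(axis=0)
    var = z.var(axis=0)
    return np.mean(mean ** 2) + np.mean((var - 1.0) ** 2)

def lewm_loss(z_pred, z_target, z_batch, reg_weight=0.1):
    """Two-term JEPA-style loss: latent prediction error plus the Gaussian
    latent regularizer, replacing heavier stabilization machinery."""
    pred = np.mean((z_pred - z_target) ** 2)
    return pred + reg_weight * gaussian_latent_reg(z_batch)
```

A single scalar `reg_weight` is the only stabilization hyperparameter in this sketch, which is the hyperparameter-overhead point the summary makes.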

Latent Space Learning Outperforms Pixel-Level Prediction for Physical System Representation

Current machine learning approaches for spatiotemporal physical systems primarily focus on next-frame prediction, which is computationally expensive and prone to compounding errors. This research proposes evaluating models on downstream scientific tasks, specifically the estimation of governing physical parameters, to better assess the physical relevance of learned representations. The study demonstrates that latent space learning methods, such as JEPAs, are more effective for these tasks than methods optimizing pixel-level prediction objectives, even outperforming some methods designed specifically for physical modeling.
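
The proposed evaluation, estimating governing physical parameters from learned representations, can be approximated with a linear probe. The sketch below (illustrative, not the paper's protocol) fits a least-squares regression from latents to parameters and reports R²; representations that expose the governing parameters more linearly score higher.

```python
import numpy as np

def linear_probe_r2(latents, params):
    """Fit a linear probe latents -> physical parameters and report R^2."""
    X = np.hstack([latents, np.ones((len(latents), 1))])  # add bias column
    w, *_ = np.linalg.lstsq(X, params, rcond=None)
    pred = X @ w
    ss_res = np.sum((params - pred) ** 2)
    ss_tot = np.sum((params - params.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot
```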

Temporal Straightening: Enhancing Latent Planning through Curvature Regularization

Latent planning using world models benefits significantly from effective representation learning. While pre-trained visual encoders provide strong semantic features, they often include irrelevant information detrimental to planning. This work introduces "temporal straightening," a novel curvature regularization technique applied to latent trajectories. This method, inspired by human visual processing, aims to create locally straightened latent spaces where Euclidean distance more accurately reflects geodesic distance, thereby improving gradient-based planning stability and success rates in goal-reaching tasks.
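
A minimal form of such a curvature penalty (an illustrative sketch, not necessarily the paper's exact regularizer) measures the turning angle between consecutive displacement vectors along a latent trajectory; a perfectly straight trajectory incurs zero loss.

```python
import numpy as np

def curvature_loss(traj, eps=1e-8):
    """Mean turning cost along a latent trajectory: 1 - cos(angle) between
    consecutive displacement vectors. Zero for a straight line."""
    d = np.diff(traj, axis=0)                      # displacements between frames
    a, b = d[:-1], d[1:]
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps)
    return float(np.mean(1.0 - cos))
```

Adding this term to a representation-learning objective penalizes curved latent trajectories, nudging Euclidean distance toward the geodesic distance the summary describes.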

Overcoming AI Stupidity: World Models, Self-Supervised Learning, and the Future of Embodied AI

Current AI systems, particularly large language models, are limited by their inability to understand the physical world, reason, plan, and possess persistent memory, leading to what Yann LeCun describes as "stupidity." LeCun advocates for the development of "world models" using self-supervised learning, enabling AI to learn abstract representations from sensory input, predict outcomes, and perform hierarchical planning. This approach is crucial for advancing AI capabilities beyond discrete symbolic reasoning to robust physical world interaction and robotic intelligence.

Yann LeCun's AMI Labs Raises Over $4.5 Billion for AGI Research

Yann LeCun has raised over $4.5 billion (post-money) for his new AGI laboratory, AMI Labs. The lab will focus on developing world models, diverging from the current industry trend of large language models. This significant investment underscores a strong belief in LeCun's vision for advancing Artificial General Intelligence through alternative research paradigms.

AMI Labs Secures Record Seed Round to Develop World-Model-Centric AI

Advanced Machine Intelligence (AMI Labs) has completed a €890 million ($1.03 billion) seed funding round, one of the largest ever, to develop a new generation of AI systems. The company's focus is on building universally intelligent systems incorporating world models, persistent memory, reasoning, planning, controllability, and safety. This substantial capital injection positions AMI Labs to aggressively pursue its foundational AI research and development across its global locations.

Transformer Behavior: Decoupling Massive Activations and Attention Sinks

Transformer language models exhibit "massive activations" (extreme outliers in channels for a few tokens) and "attention sinks" (tokens attracting disproportionate attention). While often co-occurring, these phenomena serve distinct functions. Massive activations act globally as implicit model parameters, while attention sinks operate locally, biasing attention heads towards short-range dependencies. Their co-occurrence is an architectural artifact of pre-norm Transformer configurations.
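
The two phenomena can be operationalized with simple heuristics (illustrative thresholds, not the paper's exact criteria): massive activations are entries whose magnitude dwarfs the typical activation, while attention sinks are key tokens that receive a disproportionate share of attention mass.

```python
import numpy as np

def massive_activation_mask(h, ratio=50.0):
    """Flag (token, channel) entries whose magnitude dwarfs the typical
    activation. h: [tokens, channels] hidden states."""
    typical = np.median(np.abs(h))
    return np.abs(h) > ratio * typical

def attention_sink_scores(attn):
    """Fraction of total attention mass each key token receives, averaged
    over query positions. attn: [queries, keys], rows sum to 1."""
    return attn.mean(axis=0)
```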

AI+Hardware Co-design: A Decade-Long Roadmap for Sustainable AI Systems

The future of AI requires a unified, long-term vision for AI and hardware co-development, moving beyond fragmented approaches. This roadmap emphasizes scaling efficiency and achieving exponential gains in intelligence per joule, rather than solely focusing on compute consumption. It redefines scaling around energy efficiency, system-level integration, and cross-layer optimization to foster holistic and adaptive AI systems across diverse environments. The paper outlines a 10-year plan for addressing the challenges and opportunities in AI+HW co-design.

Transfusion Framework for Multimodal Pretraining

This paper introduces the Transfusion framework for multimodal pretraining, specifically designed to explore the design space for native multimodal models without prior language pretraining. It details a controlled experimental approach using next-token prediction for language and diffusion for vision, trained on diverse data including text, video, image-text pairs, and action-conditioned video. Key findings address optimal visual representations, data complementarity, world modeling capabilities, and efficient scaling through Mixture-of-Experts.

Yann LeCun's Journey Through AI and the Future of Machine Intelligence

This interview with Yann LeCun traces his personal and professional journey through the field of machine learning, from early neural network research to the modern era of deep learning. LeCun details the historical ebb and flow of neural network popularity, emphasizing key technical advancements and offering a critical perspective on current methodologies. He advocates for a future centered on self-supervised learning and "world models" for more efficient and human-like AI.

Rethinking AI Development: From Artificial General Intelligence to Superhuman Adaptable Intelligence

This paper argues against the prevailing concept of Artificial General Intelligence (AGI) as a flawed and ill-defined goal for AI development. Instead, it proposes a new framework: Superhuman Adaptable Intelligence (SAI). SAI emphasizes specialization and superhuman performance in specific domains, aiming to exceed human capabilities and fill skill gaps. This shift in perspective provides a clearer, more actionable direction for future AI research and development.

Geometric Priors Enable Data-Efficient LLM Training

Large Language Models (LLMs) traditionally adhere to scaling laws that dictate increasing data for improved performance. This work challenges these laws by introducing the Geodesic Hypothesis and a Semantic Tube Prediction (STP) task. STP, a JEPA-style regularizer, constrains hidden-state trajectories to a curved path, enhancing signal-to-noise ratio and diversity and ultimately yielding significant data-efficiency gains.

AI's Current State and Future Trajectory: Beyond Language Models

Yann LeCun argues that current AI systems, particularly LLMs, are primarily advanced information retrieval systems, not truly intelligent entities, and criticizes the anthropomorphization of these systems. He emphasizes that real intelligence involves learning through observation and interaction to build mental models of the world, a capability largely absent in current AI. LeCun envisions AI as an amplifier of human intelligence, acting as a "staff" for individuals, and predicts a gradual, not abrupt, advancement, with long-term technological shifts often underestimated.

Radial-VCReg: Enhancing Representation Learning Through Radial Gaussianization

Self-supervised learning aims to maximize information in representations, but is limited by the curse of dimensionality. Radial-VCReg improves upon existing methods like VCReg by introducing a radial Gaussianization loss. This aligns feature norms with the Chi distribution, a characteristic of high-dimensional Gaussians, leading to more diverse and informative representations by reducing higher-order dependencies.
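
The radial target can be made concrete: for z ~ N(0, I_d), the norm ||z|| follows a Chi distribution with d degrees of freedom. A minimal radial loss (a sketch of the idea, not Radial-VCReg's exact formulation) penalizes deviation of feature norms from the Chi mean.

```python
import numpy as np
from math import lgamma, exp, sqrt

def chi_mean(d):
    """Mean of the Chi distribution with d degrees of freedom:
    E||z|| for z ~ N(0, I_d) = sqrt(2) * Gamma((d+1)/2) / Gamma(d/2)."""
    return sqrt(2.0) * exp(lgamma((d + 1) / 2) - lgamma(d / 2))

def radial_loss(z):
    """Penalize feature norms that deviate from the typical norm of a
    standard d-dimensional Gaussian (a sketch of radial Gaussianization)."""
    d = z.shape[1]
    norms = np.linalg.norm(z, axis=1)
    return float(np.mean((norms - chi_mean(d)) ** 2))
```

For truly Gaussian features this loss settles near the Chi variance (about 0.5 in high dimension) rather than zero; a fuller treatment would match the whole norm distribution, not just its mean.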

Causal-JEPA: Enhancing World Models via Object-Centric Latent Interventions

C-JEPA extends masked joint embedding prediction to object-centric representations to better capture interaction-dependent dynamics in world models. By utilizing object-level masking, the architecture forces the inference of states from relational contexts, inducing a causal inductive bias that enhances counterfactual reasoning and drastically reduces the latent feature overhead for agent planning.

Standardizing World Model Research with stable-worldmodel

The stable-worldmodel (SWM) ecosystem addresses the reproducibility crisis in World Model research by providing standardized environments, tools, and baselines. It enables efficient data collection and supports research into robustness and continual learning through controllable environmental factors. SWM offers a unified platform for developing and evaluating World Models, mitigating issues of publication-specific implementations and fostering reusability.

EB-JEPA: Accessible Energy-Based Joint-Embedding for Representation Learning and World Models

EB-JEPA is an open-source library that implements Joint-Embedding Predictive Architectures (JEPAs) for learning representations and world models. JEPAs predict in representation space, avoiding the complexities of generative modeling while capturing semantic features. The library provides modular, single-GPU friendly implementations demonstrating scalability from image-level self-supervised learning to video and action-conditioned world models.

Rectified LpJEPA: Enabling Sparsity in Joint-Embedding Predictive Architectures

Rectified LpJEPA introduces a novel regularization technique, Rectified Distribution Matching Regularization (RDMReg), for Joint-Embedding Predictive Architectures (JEPA). This method addresses the limitation of existing JEPA approaches that favor dense representations by explicitly promoting sparsity. By aligning representations to a Rectified Generalized Gaussian (RGG) distribution, Rectified LpJEPA achieves controllable sparsity while maintaining maximum-entropy properties and competitive performance in image classification tasks.

GRASP: A Parallel Stochastic Gradient Planner for World Models

World models face challenges in planning due to vast search spaces. The GRASP algorithm addresses this by using a differentiable world model for efficient, parallelized optimization. It treats states as "virtual states" with soft dynamics constraints and introduces stochasticity to avoid local optima, outperforming existing planning algorithms in success rate and convergence time on long-horizon tasks.
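
The core mechanics, free "virtual" states tied to the dynamics by a soft penalty, gradient descent with decaying noise, and selection over parallel restarts, can be sketched on a toy linear system. Everything here (the `s' = s + a` dynamics, hyperparameters, and function names) is an illustrative assumption, not the GRASP implementation.

```python
import numpy as np

def grasp_plan(s0, goal, T=10, lam=5.0, steps=400, lr=0.02,
               restarts=8, noise0=0.5, seed=0):
    """Stochastic gradient planner sketch: states are free 'virtual'
    variables tied to toy dynamics s_{t+1} = s_t + a_t by a soft penalty;
    decaying noise helps escape poor local optima; the best of several
    restarts (run in parallel in principle, serially here) is returned."""
    rng = np.random.default_rng(seed)
    d = len(s0)
    best_cost, best_A = np.inf, None
    for _ in range(restarts):
        S = np.linspace(s0, goal, T + 1) + rng.normal(0, 0.1, (T + 1, d))
        S[0] = s0
        A = rng.normal(0, 0.1, (T, d))
        for k in range(steps):
            r = S[1:] - S[:-1] - A                  # soft dynamics residuals
            gS = np.zeros_like(S)
            gS[1:] += 2 * lam * r
            gS[:-1] -= 2 * lam * r
            gS[-1] += 2 * (S[-1] - goal)            # terminal goal cost
            gA = -2 * lam * r
            noise = noise0 * (1 - k / steps)        # decaying exploration noise
            S[1:] -= lr * (gS[1:] + rng.normal(0, noise, gS[1:].shape))
            A -= lr * (gA + rng.normal(0, noise, gA.shape))
        # score the plan by rolling the actions through the true dynamics
        s = np.array(s0, dtype=float)
        for a in A:
            s = s + a
        cost = float(np.sum((s - goal) ** 2))
        if cost < best_cost:
            best_cost, best_A = cost, A
    return best_A, best_cost
```

With a learned differentiable world model in place of the toy dynamics, the same structure gives the parallelized, stochasticity-injected optimization the summary describes.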

GMM-Anchored JEPA Improves Self-Supervised Speech Representation

Joint Embedding Predictive Architectures (JEPA) struggle with representation collapse in self-supervised speech learning. GMM-Anchored JEPA addresses this by using a Gaussian Mixture Model (GMM) to generate frozen soft posteriors as auxiliary targets. This method, unlike previous iterative re-clustering approaches, applies a one-time clustering with soft assignments and a decaying supervision schedule, enhancing model stability and performance across various speech tasks.
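
The anchoring step can be sketched as follows (an illustrative reconstruction; the diagonal-covariance GMM and the linear decay schedule are assumptions): a frozen GMM produces soft responsibilities used as auxiliary targets, and their supervision weight decays over training.

```python
import numpy as np

def gmm_soft_posteriors(x, means, variances, weights):
    """Responsibilities p(k | x) under a frozen diagonal GMM; used as soft
    auxiliary targets instead of hard cluster assignments.
    x: [N, d]; means, variances: [K, d]; weights: [K]."""
    # log N(x | mu_k, diag(sigma_k^2)), up to a constant shared across components
    d2 = ((x[:, None, :] - means[None]) ** 2 / variances[None]).sum(-1)
    logp = np.log(weights)[None] - 0.5 * (d2 + np.log(variances).sum(-1)[None])
    logp -= logp.max(axis=1, keepdims=True)        # stabilize the softmax
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)

def supervision_weight(step, total_steps, w0=1.0):
    """Decaying schedule: auxiliary GMM supervision fades out over training,
    leaving the JEPA objective on its own once representations stabilize."""
    return w0 * max(0.0, 1.0 - step / total_steps)
```

Because the GMM is fit once and then frozen, the targets never drift, which is the contrast the summary draws with iterative re-clustering approaches.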

Representation Autoencoders Outperform VAEs in Large-Scale Text-to-Image Generation

Representation Autoencoders (RAEs) demonstrate superior performance and stability compared to Variational Autoencoders (VAEs) in large-scale text-to-image (T2I) generation. RAEs achieve faster convergence and better generation quality, even with a simplified framework, making them a more robust foundation for T2I models. This success is partly attributed to their ability to operate within a shared representation space for both visual understanding and generation, opening new avenues for unified multimodal models.

Latent Action World Models for In-the-Wild Video Analysis

This paper explores the development of latent action world models capable of operating on "in-the-wild" video data. Traditional world models often necessitate explicit action labels, which are impractical for diverse, real-world scenarios. The research demonstrates that continuous, constrained latent actions can effectively capture the complexity of real-world interactions, even in the presence of environmental noise and varying embodiments across videos. This advancement allows for the potential of learning universal interfaces for planning tasks.

JEPA-WMs: Technical Choices for Efficient Planning in Learned Representation Spaces

Recent advancements in AI aim to develop agents capable of solving diverse physical tasks and generalizing to new environments. A promising approach involves training world models from state-action trajectories for planning. This work characterizes a family of such models as JEPA-WMs, which optimize planning within the learned representation space of the world model to abstract irrelevant details and enhance efficiency. The study investigates the impact of model architecture, training objectives, and planning algorithms on planning success, proposing a model that outperforms established baselines.

Satirical Critique of Authoritarianism vs. European Social Democracies

The provided content, a satirical post quoted by Yann LeCun, juxtaposes the perceived "weakness" of European social democracies (characterized by social benefits, personal freedoms, and stability) with the "strength" of authoritarian regimes (marked by control, fear, and suppression of dissent). It implicitly argues that the stability and freedoms of the former are desirable, while the latter, despite its supposed "strength," leads to oppression and a lack of genuine well-being. The satire highlights the benefits of a society that prioritizes citizen welfare and predictable safety over control and enforced conformity.

Bridging JEPA Models and Action Planning through Value-Guided Representation Learning

This paper proposes an enhancement to Joint-Embedding Predictive Architectures (JEPA) for improved action planning. It addresses the limitation of current JEPA models in supporting effective planning by shaping their representation space. This shaping is achieved by approximating the negative goal-conditioned value function with a distance metric between state embeddings, leading to better performance on control tasks.
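
The shaping idea can be sketched directly: train the embedding so that the distance ||phi(s) - phi(g)|| matches the negative goal-conditioned value -V(s, g); planning then reduces to greedily decreasing embedding distance to the goal. Function names and the exact regression form are illustrative assumptions.

```python
import numpy as np

def value_alignment_loss(phi_s, phi_g, values):
    """Shape the representation so that embedding distance approximates the
    negative goal-conditioned value: ||phi(s) - phi(g)|| ~ -V(s, g).
    values are negative (cost-to-go), so -values is a distance target."""
    dist = np.linalg.norm(phi_s - phi_g, axis=1)
    return float(np.mean((dist - (-values)) ** 2))

def greedy_value(phi_candidates, phi_goal):
    """Plan greedily: pick the candidate next state whose embedding is
    closest to the goal embedding (i.e., highest approximate value)."""
    d = np.linalg.norm(phi_candidates - phi_goal, axis=1)
    return int(np.argmin(d))
```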

LLM Parameter Count Approximates Mouse Brain Synapses

Large Language Models (LLMs) currently possess parameter counts on par with the number of synapses found in a mouse brain. This comparison highlights the significant scale achieved by modern AI models, placing them within a biological order of magnitude relevant to neuroscientific considerations. This suggests a potential, albeit abstract, benchmark for complexity in AI development relative to biological systems.

The Web’s European, Public-Sector Origins

The World Wide Web, a foundational technology for free discourse, originated in a European government research institution. It was developed at CERN by Sir Tim Berners-Lee, emphasizing its non-commercial and publicly funded genesis.
