
Andrew Ng

Chronological feed of everything captured from Andrew Ng.

Inception Labs’ Mercury 2: A Breakthrough in Diffusion LLMs for Faster Inference

Inception Labs has launched Mercury 2, a diffusion LLM that delivers significantly faster inference than traditional autoregressive LLMs. The development introduces a new paradigm for language-model architecture, moving beyond sequential token-by-token generation toward a more efficient diffusion-based approach. A reported 5x speedup over leading speed-optimized LLMs positions Mercury 2 as a significant advance in the field, with implications for real-time applications and computational efficiency.
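A toy sketch of why diffusion-style decoding can be faster: an autoregressive decoder spends one model call per token, while a parallel refiner spends one call per denoising step, independent of sequence length. Everything here (the target string, the 4-step schedule, the "reveal a prefix" denoiser) is invented for illustration and is not how Mercury 2 works internally.

```python
# Toy comparison of model-call counts: autoregressive decoding produces one
# token per forward pass, while a diffusion-style decoder refines every
# position in parallel over a small, fixed number of denoising steps.

TARGET = list("hello world")

def autoregressive_decode(target):
    """Emit one token per model call, left to right."""
    out, calls = [], 0
    for tok in target:
        out.append(tok)  # one forward pass per generated token
        calls += 1
    return out, calls

def diffusion_decode(target, steps=4):
    """Refine all positions at once; each step is a single model call."""
    out, calls = ["_"] * len(target), 0
    for step in range(steps):
        # toy "denoiser": reveal a progressively larger prefix each pass;
        # a real diffusion LLM updates all positions jointly per step
        k = len(target) * (step + 1) // steps
        out[:k] = target[:k]
        calls += 1
    return out, calls
```

For the 11-character target, the autoregressive path needs 11 model calls, while the toy parallel refiner needs only the 4 fixed steps; that call-count gap, not the toy logic itself, is the source of the speed advantage.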

Scaling TensorFlow: From Sequential Models to Functional APIs and Distributed Training

Transitioning from sequential to functional APIs in TensorFlow is critical for implementing complex architectures like multi-output object detectors and generative models (VAEs, GANs). Mastery of custom training loops further enables low-level control over loss reduction and distributed training across multi-GPU or TPU hardware. This shift allows developers to move from standard library implementations to research-grade, scalable deep learning models.
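The low-level control that custom training loops expose can be illustrated with a framework-agnostic sketch. This pure-Python loop fits y = w*x + b by writing out the loss reduction, the gradients, and the optimizer step explicitly; in TensorFlow the gradient computation would come from tf.GradientTape rather than the hand-derived formulas used here.

```python
# Minimal custom training loop for linear regression, written by hand to
# show the pieces a custom loop controls: forward pass, loss reduction,
# gradient computation, and the parameter update rule (plain SGD).

def train(xs, ys, lr=0.05, epochs=500):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # forward pass and per-example errors
        preds = [w * x + b for x in xs]
        errs = [p - y for p, y in zip(preds, ys)]
        # analytic gradients of the mean-squared-error loss
        grad_w = sum(2 * e * x for e, x in zip(errs, xs)) / n
        grad_b = sum(2 * e for e in errs) / n
        # explicit optimizer step
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

On data generated by y = 2x + 1, the loop recovers w near 2 and b near 1; distributed training wraps exactly this step in a strategy that averages gradients across replicas.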

AI as a creative catalyst for new job roles

AI, exemplified by a bespoke cake design process, can generate new job opportunities by enhancing human creativity. While acknowledging concerns about job displacement, historical precedent suggests technological advancements that amplify human ingenuity lead to overall job growth. Therefore, even in early stages, AI shows potential to expand professional avenues rather than purely diminish them.

The Bifurcation of AI: Edge Democratization vs. Political Consolidation

The AI landscape is shifting toward a bifurcated evolution: the democratization of high-performance reasoning via efficient open-weights and edge-optimized hybrid models (e.g., GLM-5, LFM2.5), and the consolidation of industry power through aggressive political lobbying by 'Big AI' to shape regulatory frameworks. Simultaneously, AI's pattern recognition capabilities are extending into preventative medicine through multimodal sleep-signal analysis.

xAI and SpaceX Merge, Aim for Space-Based AI Infrastructure Amidst Industry Skepticism

Elon Musk's SpaceX has acquired xAI, forming the world's most valuable private company and signaling a strategic shift towards space-based AI applications. This merger aims to provide xAI with robust financing to compete with other AI leaders and accelerate the development of orbiting data centers. However, the financial rationale for the acquisition and the feasibility of large-scale space-based data centers face significant skepticism from financial and scientific experts.

A2A Protocol: Standardizing AI Agent Communication

The A2A protocol, an open standard developed in partnership with Google Cloud and IBM Research, aims to standardize communication between AI agents, regardless of their underlying frameworks. This client-server protocol enables seamless collaboration, promoting reusability and independent development of agents. Its adoption is positioned to become an industry standard, facilitating complex, multi-agent workflows.
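To make the client-server shape concrete, here is a sketch of a task request one agent might send another. A2A is carried as JSON-RPC over HTTP, but the method name and field layout below are simplified placeholders for illustration, not the normative A2A schema.

```python
import json
import uuid

# Illustrative client-to-agent task request in the spirit of A2A.
# "message/send" and the params layout are placeholders, not spec-exact.

def make_task_request(agent_url, user_text):
    """Build a JSON-RPC-style envelope asking a remote agent to run a task."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),           # correlates request and response
        "method": "message/send",          # placeholder method name
        "params": {
            "message": {
                "role": "user",
                "parts": [{"type": "text", "text": user_text}],
            },
            "target": agent_url,           # the receiving agent's endpoint
        },
    }
```

Because the envelope is plain JSON over HTTP, any agent framework that implements the protocol can serve or consume it, which is what makes agents reusable and independently developed.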

Retrieval Augmented Generation: Enterprise LLM Performance Enhancement

RAG significantly improves large language model performance for enterprise applications by integrating LLMs with trusted databases. This approach enables LLMs to access specialized, up-to-date, and personalized information, facilitating domain-specific answers and informed response generation.
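A minimal sketch of the RAG pattern: retrieve the most relevant document, splice it into the prompt, then generate. The keyword-overlap retriever stands in for a vector database, `fake_llm` stands in for a real model call, and the documents are invented.

```python
# Minimal retrieval-augmented generation: score documents by word overlap
# with the query, then ground the prompt in the best match.

DOCS = [
    "Refund policy: customers may return items within 30 days.",
    "Shipping: standard delivery takes 5 business days.",
]

def retrieve(query, docs):
    """Return the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def answer(query, docs, llm):
    context = retrieve(query, docs)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)

def fake_llm(prompt):
    # placeholder: a real deployment calls an LLM API here;
    # this stub just echoes the retrieved context line
    return prompt.splitlines()[1]
```

The design point is that the LLM never needs the enterprise data in its weights: swapping `DOCS` for a trusted, up-to-date database changes the answers without retraining.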

Emerging AI Trends: Job Market Shifts, Agentic AI Evolution, and Efficient Model Distillation

The AI landscape is rapidly evolving, impacting the job market by shifting demand towards AI-skilled workers and enabling more efficient team structures. Concurrently, advanced agentic AI systems like OpenClaw and Kimi K2.5 are demonstrating powerful autonomous capabilities and parallel task execution, though they present new security and cost considerations. Innovations in model distillation, exemplified by Mistral AI, are yielding highly capable, smaller models with reduced training costs, driving the potential for on-device AI.

Rapid AI Development & Deployment: The New Competitive Edge

The current AI landscape is characterized by unprecedented speed in development and deployment, making rapid execution a primary competitive advantage. The ability to iterate quickly and focus on end-user value, rather than just cost savings, is crucial for transformative growth. This new paradigm requires a shift in how companies approach AI adoption, emphasizing technical proficiency across all roles and a deeper workflow redesign rather than incremental efficiency gains from existing processes.

Navigating the AI Revolution: Skill Up or Risk Disruption

Andrew Ng, a prominent AI leader, emphasizes the critical need for individuals and nations to rapidly acquire AI skills to avoid being marginalized by the technology's advancements. He argues that sophisticated AI tool utilization is becoming indispensable across various professions, not just software engineering. Despite the hype surrounding Artificial General Intelligence (AGI), Ng believes current technologies are not a direct path to it, advocating instead for focusing on practical AI applications and upskilling initiatives. He also highlights the geopolitical implications of AI, urging nations to invest in open-source AI models to maintain control over their critical infrastructure and counter the influence of dominant foreign models.

Profinite Completion Equivalence for Aspherical Manifolds

This paper demonstrates that smooth, closed, connected aspherical manifolds with "good" fundamental groups are cobordant and have congruent signatures modulo 8 if their profinite completions are isomorphic. Additionally, the spin structure is preserved under this isomorphism. The findings extend to compact connected aspherical manifolds, establishing a strong relationship between the algebraic property of profinite completion and the topological properties of cobordism and spin structures.

LLMs: Advancements, Applications, and Data Integration Challenges

This content explores the current state and future trajectory of Large Language Models (LLMs), highlighting their growing generalization capabilities and the persistent challenges in adapting them to specialized, data-scarce domains. It also covers recent developments in video generation models and OpenAI's new GPT-5.2 suite, showcasing the rapid evolution and diverse applications of AI while underscoring the ongoing need for innovative data-centric approaches to enhance model intelligence and efficiency.

Iterative Refinement with Tiny Recursive Models Outperforms Large LLMs in Complex Puzzle Solving

Small, specialized neural networks (Tiny Recursive Models or TRMs) employing iterative refinement with context embedding demonstrate superior performance over large language models (LLMs) in visual puzzles requiring precise, multi-element solutions. This approach allows TRMs to iteratively improve solutions and track changes without explicit loss functions, making them more effective and efficient for specific tasks like Sudoku and ARC-AGI benchmarks where a single error invalidates the entire solution.
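The refinement loop itself can be sketched independently of the learned network. Below, a hand-coded correction step repairs a broken "solution" one error at a time until an all-or-nothing verifier passes; in a TRM the correction step is learned rather than hand-written, and the toy task (repairing a list into a permutation of 0..n-1) is invented for illustration.

```python
# Generic iterative-refinement loop: apply a small correction step and
# re-check the verifier until the whole solution is valid.

def refine(candidate, step, is_valid, max_iters=32):
    for _ in range(max_iters):
        if is_valid(candidate):
            return candidate
        candidate = step(candidate)
    return candidate

def is_permutation(xs):
    """All-or-nothing check: one wrong element invalidates the solution."""
    return sorted(xs) == list(range(len(xs)))

def fix_one_error(xs):
    """Replace the first out-of-range or duplicate value with the smallest missing one."""
    missing = sorted(set(range(len(xs))) - set(xs))
    seen, out = set(), list(xs)
    for i, x in enumerate(out):
        if x in seen or not 0 <= x < len(out):
            out[i] = missing.pop(0)
            break
        seen.add(x)
    return out
```

This mirrors the Sudoku/ARC-AGI setting described above: since a single error fails the verifier, iterating small corrections beats emitting one long answer in a single pass.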

Operationalizing ML: Transitioning from Model Training to Production Lifecycle Management

The updated Machine Learning in Production course shifts focus from isolated model training to the holistic deployment lifecycle. It provides a framework for project scoping, data management, and operational maintenance to ensure robust model performance in real-world applications.

Hierarchical Flow Matching for Multi-Scale Climate Emulation

Spatiotemporal Pyramid Flows (SPF) replace slow autoregressive weather-scale emulation with a hierarchical flow matching architecture. By partitioning the generative trajectory into a spatiotemporal pyramid conditioned on physical forcings, the model enables efficient, parallel sampling across multiple temporal and spatial resolutions. Validated on the new ClimateSuite dataset (33k simulation-years), SPF demonstrates superior performance on ClimateBench and strong generalization across diverse climate models.

Demystifying ML Math: A New Specialization for AI Professionals

The DeepLearning.AI Mathematics for Machine Learning and Data Science Specialization addresses a critical gap in AI education by providing a foundational understanding of the mathematical and optimization methods underpinning ML and data science algorithms. This program aims to surmount common hurdles in AI career progression, such as interview rejections due to math deficiencies and general apprehension towards the mathematical rigor of the field. It emphasizes practical application through interactive exercises and hands-on labs, covering topics from probability and uncertainty calculation to confidence intervals, hypothesis testing, and linear algebra.
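As a taste of the statistics the specialization covers, a 95% confidence interval for a sample mean can be computed by hand using the normal approximation; the sample data below are invented for illustration.

```python
import math

# 95% confidence interval for a sample mean via the normal approximation
# (z = 1.96), with Bessel's correction for the sample variance.

def mean_ci_95(xs):
    n = len(xs)
    mean = sum(xs) / n
    # sample variance with the n - 1 denominator
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    half = 1.96 * math.sqrt(var / n)   # half-width of the interval
    return mean - half, mean + half
```

For the sample [4, 5, 6, 5, 4, 6] this gives an interval centered on the mean of 5, roughly (4.28, 5.72); a t-distribution critical value would widen it slightly for a sample this small.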

AI Investment Asymmetry and the Shift Toward Behavioral Steerability

The AI market is bifurcated: infrastructure for training faces potential bubble risks and eroding moats, while inference capacity and the application layer remain under-supplied and under-invested. Technically, the field is moving toward modularity in deployment (multi-cloud availability for models) and precise behavioral control via persona vector manipulation during inference and fine-tuning.

Andrew Ng on the Evolution and Future of Agentic AI

Andrew Ng discusses the landscape of agentic AI, emphasizing its iterative, multi-step prompting approach for complex workflows. He highlights the divergence in memory architectures driven by diverse use cases and advocates for a multi-model future over a single, all-encompassing AI. Ng also provides insights into AI adoption in enterprises, advocating for application-driven data infrastructure development and data ownership.

STARC-9: A Diverse Dataset for Colorectal Cancer Histopathology Classification

STARC-9 is a new large-scale dataset for multi-class tissue classification in colorectal cancer (CRC) histopathology. It addresses limitations of existing datasets by providing morphologically diverse, high-quality image tiles across nine clinically relevant tissue classes. The dataset was constructed using DeepCluster++, a novel semi-automated framework that ensures intra-class diversity and reduces manual curation, improving model generalizability for downstream machine learning applications.

Profinite Criterion for Primitive Words in One-Relator Groups with Torsion

This paper introduces a method for identifying surface subgroups within certain one-relator groups with torsion. From this, the authors derive a profinite criterion that determines whether a given word in a free group is primitive, offering a novel tool for analyzing group structures.

UQ: A Novel Benchmark for Language Model Evaluation on Unsolved Questions

Traditional AI benchmarks struggle with a difficulty-realism trade-off. This paper introduces UQ, a new paradigm that evaluates language models on unsolved, real-world questions. UQ leverages a community-driven, asynchronous evaluation process with validator-assisted screening to assess frontier models on challenging and diverse problems. This approach aims to provide a more realistic and impactful measure of model capabilities.

LLMs Enable Semi-Automatic Ontology Generation from Lab Automation XML Schemas

The RELRaE framework uses LLMs across multiple pipeline stages — extraction, labelling, refinement, and evaluation — to surface implicit relationships within XML schemas produced by robotic laboratory systems. The goal is to enrich these schemas into ontology-ready knowledge graphs, enabling data interoperability across labs. The work demonstrates that LLMs can accurately generate and self-evaluate relationship labels in a domain-specific, structured-data context, supporting broader semi-automatic ontology construction workflows.
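A sketch of what the extraction stage might look like, assuming stdlib XML parsing and a stubbed labelling function in place of the LLM calls; the schema fragment and the "hasPart" label are invented for illustration, not taken from RELRaE.

```python
import xml.etree.ElementTree as ET

# Walk an XML document from a lab instrument and emit candidate
# (parent, relation, child) triples for downstream ontology construction.

XML = """
<experiment>
  <sample><volume unit="ml">5</volume></sample>
</experiment>
"""

def extract_triples(xml_text):
    root = ET.fromstring(xml_text)
    triples = []
    def walk(node):
        for child in node:
            triples.append((node.tag, label_relation(node.tag, child.tag), child.tag))
            walk(child)
    walk(root)
    return triples

def label_relation(parent, child):
    # stub: in a RELRaE-style pipeline an LLM proposes a relationship label
    # here and a later stage asks an LLM to evaluate and refine it
    return "hasPart"
```

The nesting that XML leaves implicit becomes explicit, labelled edges, which is what lets the resulting knowledge graph interoperate across labs.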

Andrew Ng Debunks "AI Will Automate Coding" Myth, Highlights Agentic Workflows & US Competitiveness Concerns

Andrew Ng argues that learning to code remains a crucial skill, as individuals proficient in computer languages will leverage AI more effectively. He advocates for "agentic workflows," where AI iteratively develops solutions, and expresses concern over US national competitiveness in AI due to immigration policies, underinvestment in science, and reliance on foreign semiconductor manufacturing. Ng emphasizes the need to build trust in AI benefits and encourages immediate application of current AI capabilities rather than waiting for AGI.

Speed as the Primary Driver of AI Startup Success

AI Fund's analysis of startup success factors identifies execution speed as a key predictor, and new AI technologies significantly accelerate it, making speed critical for startups. The biggest opportunities lie at the application layer, fueled by agentic AI's iterative workflows. Concrete ideas, rapid engineering, swift product feedback, and deep AI understanding are paramount for moving fast.

Vanishing Virtual First Betti Number in Group Theory

This paper introduces a new criterion for determining when groups have a vanishing virtual first Betti number. This criterion is then applied to construct new examples of torsion-free, finitely generated, residually finite groups that are not virtually diffuse. This work directly addresses and resolves a question posed by Kionke and Raimbault, contributing to the understanding of group properties in abstract algebra.

Andrew Ng on Agentic AI: Spectrum Thinking, Voice Stacks, and the Underrated Skills Builders Are Missing

Andrew Ng argues that framing AI systems as "agentic" on a spectrum — rather than debating whether something qualifies as an "agent" — is more productive and better reflects real-world deployment, where most business opportunities are linear or near-linear workflows rather than complex autonomous loops. He identifies systematic evals and voice stack development as critically underrated skills, while warning that the tactile judgment required to diagnose and improve agentic pipelines remains scarce and hard to transfer. On infrastructure, Ng views MCP as a strong first step toward n+m (rather than n×m) data integration effort, while agent-to-agent interoperability across teams remains largely unproven in practice.
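The n+m versus n×m point is simple arithmetic, made concrete below with illustrative counts: a shared protocol means each participant implements one adapter instead of every pair needing a bespoke connector.

```python
# Integration effort with and without a shared protocol such as MCP.

def pairwise_integrations(n_apps, m_sources):
    """Every application pairs with every data source: one bespoke connector each."""
    return n_apps * m_sources

def protocol_integrations(n_apps, m_sources):
    """Each participant implements the shared protocol exactly once."""
    return n_apps + m_sources
```

With 10 applications and 20 data sources, that is 200 bespoke connectors versus 30 protocol adapters, and the gap widens multiplicatively as the ecosystem grows.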

The Golden Age of AI Building: Leveraging Accessible Tools and AI-Assisted Coding for Accelerated Innovation

The current landscape presents an unprecedented opportunity for AI developers due to two converging factors: the readily available and affordable "Lego bricks" of AI technology (foundation models, cloud services, etc.), and the transformative impact of AI-assisted coding. This synergy dramatically lowers the barrier to entry and significantly accelerates the prototyping and development process, enabling rapid iteration and fostering a new era of invention.

MedAgentBench: A Virtual EHR Environment for LLM Agent Benchmarking

MedAgentBench is a novel, comprehensive evaluation suite designed to benchmark large language model (LLM) agents in medical record contexts. It provides a standardized environment for assessing LLM capabilities in complex, interactive healthcare tasks, addressing a critical gap in current evaluation methodologies. The platform is FHIR-compliant and aims to facilitate continuous improvement in medical LLM agent development.

Agentic AI Workflows Outperform Model Chasing for Enterprise Value

Companies should prioritize building applications with agentic workflows on readily available models like GPT-3.5: wrapped in an agentic workflow, GPT-3.5 has been shown to outperform GPT-4 used zero-shot. With the cost of generative AI APIs falling rapidly, the most effective strategy for most enterprises is to create valuable applications first rather than prematurely optimizing costs or chasing the latest foundation models. Agentic workflows succeed by breaking down complex tasks, generating code, and iterating, which significantly lowers the technical barrier for developers across AI applications, including vision AI.
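The break-down-and-iterate pattern behind these workflows can be sketched as a draft/critique/revise cycle. `toy_model` is a deterministic stand-in for real LLM calls; a production agent would also decompose the task and possibly generate code at each step.

```python
# Iterate-and-check loop at the heart of agentic workflows: draft a
# solution, critique it, revise using the feedback, repeat until it passes.

def agentic_loop(task, model, max_rounds=5):
    draft = model(f"Draft a solution to: {task}")
    for _ in range(max_rounds):
        critique = model(f"Critique this solution: {draft}")
        if critique == "OK":
            return draft
        draft = model(f"Revise using feedback '{critique}': {draft}")
    return draft

def toy_model(prompt):
    # deterministic stand-in: rejects the first draft, approves the revision
    if prompt.startswith("Draft"):
        return "v1"
    if prompt.startswith("Critique"):
        return "OK" if "v2" in prompt else "needs detail"
    return "v2"
```

The loop, not the raw model, supplies the quality gain: a weaker model that gets to critique and revise its own output can beat a stronger model answering in one shot.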

Agentic Workflows Outperform Model Sophistication in LLM Applications

For most enterprises, prioritizing agentic workflows with less advanced models (e.g., GPT-3.5) yields better results than solely pursuing the latest, most powerful foundational models (e.g., GPT-4) through zero-shot approaches. The rapidly decreasing cost of LLM APIs further supports focusing on building valuable applications and optimizing costs only after achieving product-market fit. This strategy proves more effective for businesses without multi-billion dollar R&D budgets to compete with leading AI labs.
