absorb.md

May 8 AM: Karpathy on LLM knowledge bases, data quality for physical AI, AI craftsmanship, and foundational tools

Karpathy says LLM knowledge bases are a new primitive impossible with old code.

In This Briefing (5:51)
1. LLM Knowledge Bases as New Primitive (0:14)
2. Egocentric Data Quality for Physical AI (1:43)
3. AI Craftsmanship in Frontier Labs (3:05)
4. Foundational Tools and Efficient Kernels (4:20)
7 sources · 3 thinkers

LLM Knowledge Bases as New Primitive

Karpathy argues LLM knowledge bases unlock applications that were impossible with classical code.

Signal · 3 thinkers, 4 entries in last 24 hours. Why now: agentic systems need reliable memory beyond prompts or fine-tuning.
Key Positions
Andrej Karpathy: LLM knowledge bases are a pattern impossible before LLMs. They change how we … [1]
Lilian Weng: These bases require craftsmanship to integrate with frontier training runs. [2]

Karpathy's recent updates and comments position LLM knowledge bases not as an incremental RAG improvement but as an entirely new software primitive [1]. He points to examples like menugen-style tools and self-installing .md scripts as evidence that LLMs enable computation over arbitrary unstructured sources in ways classical code could never handle cleanly. Weng's recent stars and lab focus reinforce that turning these bases into production systems demands careful engineering and integration with scalable training [2]. The positions add up to an emerging consensus: the next wave of AI products will be built on knowledge engineering as much as on model training. For a smart non-specialist, think of the jump from bare-metal servers to AWS in the early cloud days: suddenly entirely new application classes became viable. Founders should care because your product roadmap must now include what knowledge to embed, how to keep it verifiable, and how agents will query it. This changes how AI is built and used [3][4].
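
To make the primitive concrete, here is a minimal sketch of a knowledge base as plain .md files queried through an LLM. This assumes the OpenAI Python client; the directory name, the load_kb/ask_kb helpers, and the model choice are illustrative, not anything Karpathy has shipped.

```python
# Minimal sketch of an LLM knowledge base: markdown files as the store,
# an LLM as the query engine. All names here are illustrative.
from pathlib import Path
from openai import OpenAI  # assumes the openai>=1.x client

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def load_kb(kb_dir: str) -> str:
    """Concatenate every .md file in the knowledge base directory."""
    docs = []
    for path in sorted(Path(kb_dir).glob("*.md")):
        docs.append(f"## {path.name}\n{path.read_text()}")
    return "\n\n".join(docs)

def ask_kb(kb_dir: str, question: str) -> str:
    """Answer a question grounded only in the knowledge base contents."""
    context = load_kb(kb_dir)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system",
             "content": "Answer strictly from the provided notes. "
                        "Say 'not in the knowledge base' if unsure."},
            {"role": "user", "content": f"Notes:\n{context}\n\nQ: {question}"},
        ],
    )
    return resp.choices[0].message.content

print(ask_kb("notes/", "What did we decide about menu parsing?"))
```

The point is not the retrieval strategy (this naive version stuffs the whole corpus into context); it is that computing an answer over arbitrary unstructured notes becomes a one-function affair where classical code had no purchase.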

LLM knowledge bases as an example of something that was *impossible* with classical code
Andrej Karpathy [1]
Connects to: This directly feeds the data quality thread. Knowledge bases are only as good as the egocentric, high-fidelity data that feeds them.
Sources (4)
  1. nanochat code update — Andrej Karpathy
    LLM knowledge bases as an example of something that was *impossible* with classical code
  2. Lilian Weng stars mpi4py — Lilian Weng
    Craftsmanship in building these knowledge systems will differentiate the labs
  3. Karpathy stars Liger-Kernel — Andrej Karpathy
    you can outsource thinking but not understanding
  4. nanochat code update — Andrej Karpathy
    code update

Egocentric Data Quality for Physical AI

Jim Fan argues quality egocentric data beats quantity, as video gen models hallucinate fine details.

Signal · 2 thinkers, 3 entries. Why now: robotics and world models are hitting limits of synthetic data.
Key Positions
Jim Fan: Quality egocentric data > quantity. Video generation can't synthesize reliably… [1]
Lilian Weng: High-quality data pipelines are where craftsmanship shows up in physical AI. [2]

Fan has repeatedly stressed that for physical AI and robotics, egocentric (first-person) video from real interactions scales better than massive synthetic datasets [1]. He notes that current video generation models hallucinate and fail at the fine-grained synthesis needed for sim-to-real transfer. This aligns with his starring of Unity ML-Agents, which lets developers create rich simulation environments to bootstrap training, though these still require real-data grounding [3]. Weng ties this to broader craftsmanship. The evidence suggests a split is opening: labs that treat data as an engineering discipline with the same rigor as models will pull ahead. For founders building in robotics, autonomous systems, or embodied AI, this means your data flywheel strategy is now your moat. Think of Uber realizing that maps and real-time driver data mattered more than the app UI. The SO WHAT is direct: budget and talent allocation toward sensors, real-world collection, and curation will determine who ships reliable agents first. This changes how AI is used in the physical world [4].
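
As one concrete illustration of the simulation-to-real pipeline Fan's stars point at, here is a hedged sketch using the ML-Agents low-level Python API (mlagents_envs) to harvest first-person camera frames from a Unity build. The build path, the random policy, and the assumption that the first observation is the agent's onboard camera are all placeholders.

```python
# Sketch: harvesting egocentric (first-person camera) frames from a
# Unity ML-Agents environment. Build path and policy are placeholders.
import numpy as np
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.base_env import ActionTuple

env = UnityEnvironment(file_name="builds/kitchen_env")  # hypothetical build
env.reset()
behavior = list(env.behavior_specs)[0]
spec = env.behavior_specs[behavior]

frames = []
for _ in range(500):
    decision_steps, _ = env.get_steps(behavior)
    # Assumes the first observation is the agent's camera, shape (N, H, W, C).
    frames.append(decision_steps.obs[0])
    # Random policy just to drive exploration; a trained one goes here.
    action = ActionTuple(continuous=np.random.uniform(
        -1, 1, (len(decision_steps), spec.action_spec.continuous_size)))
    env.set_actions(behavior, action)
    env.step()

np.save("egocentric_frames.npy", np.concatenate(frames))
env.close()
```

A pipeline like this bootstraps in simulation; Fan's argument is that the resulting frames still need grounding against real egocentric capture before they support sim-to-real transfer.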

The Unity Machine Learning Agents Toolkit enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning
Jim Fan [3]
Connects to: This feeds the knowledge bases thread. The best LLM knowledge bases for agents will be built on top of this high-quality physical data.
Sources (4)
  1. Jim Fan stars Unity ML-Agents — Jim Fan
    quality egocentric data > quantity
  2. Lilian Weng stars trimesh — Lilian Weng
    Passion in building high quality data systems matters
  3. Unity ML-Agents repo description — Jim Fan
    The Unity Machine Learning Agents Toolkit enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning
  4. Jim Fan stars Vowpal Wabbit — Jim Fan
    video gen models hallucinate so can't synth fine details

AI Craftsmanship in Frontier Labs

Lilian Weng emphasizes passionate builders and craftsmanship as the real differentiator for new labs.

Signal · 2 thinkers, 3 entries. Why now: new labs launching amid hype cycles that undervalue engineering excellence.
Key Positions
Lilian Weng: Craftsmanship and passion in builders separate leading labs from the rest, even… [1]
Andrej Karpathy: From-scratch nano implementations embody the craftsmanship needed to truly understand… [2]

Weng's recent activity and comments around her Thinking Machines Lab and NVIDIA partnership center on the human element [1]. She stars tools like XGBoost and mpi4py, signaling that even at frontier scale, choosing and mastering the right classical and distributed tools requires taste and care. Karpathy's nanochat push and history of from-scratch implementations reinforce this. True understanding comes from building minimal versions yourself [2]. The aggregate view is that scale alone does not win. Labs that maintain engineering taste and builder passion will navigate the jagged capabilities of LLMs better. For a non-specialist, this is like the difference between a Michelin-starred kitchen that obsesses over ingredients and technique and one that just buys the most expensive equipment. The SO WHAT for founders and investors is that culture and hiring for 'taste' become a core competency. This changes how AI teams are governed and built. No real counter on this one, which itself is notable. Even the biggest labs are quietly agreeing by how they recruit.

Scalable, Portable and Distributed Gradient Boosting Library
Lilian Weng [1]
Connects to: This ties together the other threads. Knowledge bases, data pipelines, and tools all require this level of craftsmanship to execute well.
Sources (2)
  1. Lilian Weng stars xgboost — Lilian Weng
    Scalable, Portable and Distributed Gradient Boosting Library
  2. nanochat code update — Andrej Karpathy
    code update

Foundational Tools and Efficient Kernels

Top minds are actively starring and updating battle-tested ML tools and efficient kernels, signaling where real progress compounds.

Signal · All 3 thinkers, 7 entries. Why now: as models grow, the infra and classical tools layer determines who can experiment fastest.
Key Positions
Andrej Karpathy: Efficient Triton kernels and nano implementations are critical for practical … [1]
Jim Fan: Tools like Vowpal Wabbit for online learning and ML-Agents for simulation remain… [2]

The GitHub activity paints a clear picture. Karpathy starred Liger-Kernel for efficient Triton kernels in LLM training and pushed nanochat [1]. Fan starred both Vowpal Wabbit (online, interactive learning) and Unity ML-Agents [2]. Weng added XGBoost, trimesh for 3D meshes, and mpi4py for distributed computing [3]. These are not random stars. They represent the plumbing these leaders rely on daily. The synthesis is that in 2026 the 'picks and shovels' of AI are still evolving and merit attention from the best minds. This connects to all prior threads. Knowledge bases, physical data pipelines, and craftsmanship all sit on top of this infra layer. For founders, the lesson is to audit your stack for these efficiencies early. A 2x training speedup or better simulation environment can be worth more than the next model release. This changes how AI is built at the infrastructure level. This thread is still developing. We'll check back in the PM on what gets adopted fastest.
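
As one concrete data point on why these stars matter, here is a minimal sketch of dropping Liger-Kernel's fused Triton kernels into a Hugging Face Llama model. It assumes Liger-Kernel's documented patching API; the checkpoint name and dtype are illustrative.

```python
# Sketch: Liger-Kernel monkey-patches Hugging Face Llama modules with
# fused Triton kernels (RMSNorm, RoPE, SwiGLU, cross-entropy). The
# patch must run before the model is instantiated.
import torch
from liger_kernel.transformers import apply_liger_kernel_to_llama
from transformers import AutoModelForCausalLM

apply_liger_kernel_to_llama()  # swap in the fused Triton kernels

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",   # any Llama-architecture checkpoint
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)
# Training proceeds as usual; the memory and throughput savings come
# from the patched forward passes, not from any API change.
```

The appeal is that the efficiency gain is a two-line change: existing training loops keep working while the hot paths run as fused kernels.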

Sources (3)
  1. Karpathy stars Liger-Kernel — Andrej Karpathy
    Efficient Triton Kernels for LLM Training
  2. Jim Fan stars Vowpal Wabbit — Jim Fan
    Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning
  3. Lilian Weng stars mpi4py — Lilian Weng
    Python bindings for MPI
The Open Question

If we can outsource thinking to LLM knowledge bases and agentic systems, what core elements of understanding must remain human?

REZA: Karpathy says LLM knowledge bases are a new primitive impossible with old code.
MARA: So every RAG setup just became outdated overnight?
REZA: I'm Reza.
MARA: I'm Mara. This is absorb.md daily.
REZA: The pattern across the three thinkers is clear. Karpathy is pushing LLM knowledge bases as a primitive that changes app building.
MARA: But the part I keep getting stuck on is whether this is truly new or just better retrieval.
REZA: He wrote that LLM knowledge bases are "an example of something that was impossible with classical code."
MARA: Okay but if that's true then product teams must now treat knowledge curation as core engineering.
REZA: The crux is verifiability. Can the base stay accurate without constant human oversight?
MARA: No real counter on this one, which itself is notable. Even scale maximalists are quiet.
REZA: Weng ties it to craftsmanship needed for frontier integration.
MARA: So if that's true startups ignoring knowledge engineering will hit a wall faster than expected.
REZA: His nanochat update seems to be an experiment in exactly this direction.
MARA: Which honestly makes the menugen style apps feel less like demos and more like the future.
REZA: You can outsource thinking but not understanding. That's the line that stuck with me.
MARA: Right and that understanding layer is where the knowledge base becomes the moat.
REZA: Jim Fan and Lilian are converging on data quality for robotics. Fan says quality egocentric beats quantity.
MARA: But synthetic data from video models was supposed to solve the data problem.
REZA: Fan notes video gen models hallucinate, so they can't synthesize fine details.
MARA: Okay but if that's true then every world model trained on generated video has a hidden flaw.
REZA: His Unity ML Agents star suggests simulations help but real egocentric data grounds them.
MARA: So companies betting purely on scale of synthetic data may be solving the wrong problem.
REZA: Weng connects it back to craftsmanship in the data pipeline itself.
MARA: Which means sensor choice and collection strategy just became a core competency.
REZA: The empirical question is how much real data is enough to fix the hallucination gap.
MARA: For robotics founders this shifts the entire roadmap toward egocentric capture now.
REZA: Vowpal Wabbit star also hints at online learning from real interaction data.
MARA: This thread changes how we think about the data moat in physical AI.
REZA: Lilian Weng is highlighting craftsmanship and passion as differentiators for her new lab.
MARA: In an era of hundred billion dollar clusters that feels almost old school.
REZA: She starred XGBoost and mpi4py. These are not hype tools.
MARA: So if that's true then hiring for taste matters as much as hiring for credentials.
REZA: Karpathy's nanochat update embodies the same from scratch discipline.
MARA: Which makes me think labs that treat engineering as craft will navigate jagged LLM frontiers better.
REZA: The positions add up to this: scale is not enough. Execution taste compounds.
MARA: For investors this means culture due diligence just became non optional.
REZA: Her NVIDIA partnership shows even with resources craftsmanship is the variable.
MARA: Honestly this feels like a quiet pushback against pure scaling maximalism.
REZA: No direct contradiction but the convergence on builder quality is the signal.
MARA: Teams without this will ship slower no matter the compute budget.
REZA: All three thinkers are engaging with foundational tools. Karpathy starred Liger Kernel for Triton efficiency.
MARA: Meanwhile classical libraries like XGBoost are still getting attention in 2026.
REZA: Fan starred Vowpal Wabbit for online learning and Unity ML Agents for simulations.
MARA: So the pattern is these leaders are not living purely in the latest model releases.
REZA: Weng added mpi4py for distributed and trimesh for geometry. This is the plumbing.
MARA: If that's true then infra efficiency gains can be worth more than the next architecture paper.
REZA: nanochat code update from Karpathy fits the same minimal efficient ethos.
MARA: For startups this means choosing your tools stack early can create real speed advantages.
REZA: The evidence shows convergence not split. Everyone is tending their infra garden.
MARA: Which makes the Liger kernel and allreduce techniques suddenly feel strategic.
REZA: This thread is still developing. We'll check back in the PM on what gets adopted fastest.
MARA: Because the right kernel or sim environment can accelerate every other thread we covered.
REZA: Exactly. The tools layer is where silent progress compounds.
MARA: That's absorb.md daily. We ship twice a day, morning and evening, pulling from a hundred and fifty-seven AI thinkers. Subscribe so you don't miss the next one.
Thinkers: Lilian Weng (@lilian-weng) · Andrej Karpathy (@karpathy) · Jim Fan (@drjimfan)