absorb.md

April 19 PM: Mythos alignment paradox & Process supervision gains & Robotics platforms unify

JPMorgan says Anthropic's most aligned model is its biggest risk.

In This Briefing
1
Mythos AI Alignment Paradox
One model cannot be both the best aligned and the highest risk, yet JPMorgan says exactly that about Anthropic's Mythos.
0:12
2
Process Supervision for LLM Reasoning
Supervising the step-by-step thinking process beats rewarding only correct final answers for complex reasoning.
1:31
3
Unified Robotics Learning Platforms
Open platforms like RoboVerse and gbrain are converging on shared benchmarks and brains for scalable humanoid control.
2:48
6 sources · 5 thinkers

Mythos AI Alignment Paradox

One model cannot be both the best aligned and the highest risk, yet JPMorgan says exactly that about Anthropic's Mythos.

Signal · Top ai-research trend with a convergence score of 22. Two thinkers, multiple entries in the last 14 hours, triggered by a new JPMorgan report surfacing an explicit tension right after recent alignment papers.
Key Positions
Michael Cembalest: Mythos is Anthropic's most aligned AI model and also the biggest alignment-related risk they've created. [1]
Ilya Sutskever: Weak-to-strong generalization offers a path to superhuman model alignment with weak supervision. [2]

Michael Cembalest's JPMorgan report introduces Mythos as a model that excels at identifying system vulnerabilities while raising new alignment challenges. [1] The report explicitly states the tension: 'Mythos is Anthropic's most aligned AI model and also the biggest alignment-related risk they've created.' Ilya Sutskever has been posting on weak-to-strong generalization as a practical path to superhuman alignment. The positions add up to an emerging view that alignment quality and capability often scale together. A more aligned model is also more powerful, and thus has greater potential for misuse if its safeguards fail. This is not a bug in the report. It is the feature we have to design for. For a founder building defensive cybersecurity tools or deploying internal agents, this means your safety playbook cannot treat alignment and capability as separate dials. They turn together. The evidence from the report suggests we are entering a phase where the race between exploit writing and patching accelerates dramatically. Analogy: it is like training a security guard who becomes so good at spotting weaknesses that you now have to worry he might exploit them himself. [2] This thread connects to the others because better reasoning supervision and physical robotics both require the same underlying alignment advances. SO WHAT: Your company's AI deployment timelines and insurance models may need updating within months, not years.
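To make the weak-to-strong idea concrete, the standard yardstick in that line of work is "performance gap recovered": how much of the gap between a weak supervisor and a strong model's ceiling a student closes when it is trained only on the weak supervisor's labels. The sketch below is a toy illustration of that metric; the numbers are invented placeholders, not results from any paper or report.

```python
# Toy sketch of the weak-to-strong generalization setup: a strong model
# is finetuned on labels produced by a weaker supervisor, and we measure
# how much of the weak-to-ceiling accuracy gap the student recovers.
# All accuracy values below are illustrative placeholders.

def performance_gap_recovered(weak_acc: float,
                              weak_to_strong_acc: float,
                              strong_ceiling_acc: float) -> float:
    """PGR = (weak-to-strong - weak) / (ceiling - weak).

    1.0 means the student fully recovers the gap despite noisy weak
    labels; 0.0 means it does no better than its weak supervisor.
    """
    gap = strong_ceiling_acc - weak_acc
    if gap <= 0:
        raise ValueError("ceiling must exceed the weak supervisor")
    return (weak_to_strong_acc - weak_acc) / gap

# Illustrative: weak supervisor is 60% accurate, a strong model trained
# on ground truth reaches 90%, and the same strong model trained only on
# the weak supervisor's labels hits 81%.
pgr = performance_gap_recovered(0.60, 0.81, 0.90)
print(f"performance gap recovered: {pgr:.2f}")  # → 0.70
```

If that ratio stays high as the capability jump grows, weak supervision transfers; if it collapses, the governance worry in the report stands.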

Mythos is Anthropic's most aligned AI model and also the biggest alignment-related risk they've created.
Michael Cembalest [1]
Connects to: This governance tension sets the stakes for how process supervision and robotics platforms must incorporate safety from day one.
Sources (2)
  1. JPMorgan Eye on the Market: Mythos AI — Michael Cembalest
    Mythos is Anthropic's most aligned AI model and also the biggest alignment-related risk they've created.
  2. Ilya Sutskever alignment paper — Ilya Sutskever
    Weak-to-strong generalization as a path to superhuman model alignment

Process Supervision for LLM Reasoning

Supervising the step-by-step thinking process beats rewarding only correct final answers for complex reasoning.

Signal · Strong signal in ai-research, with Ilya posting directly on process supervision significantly enhancing reasoning. Three entries, two thinkers converging independently in the last 14 hours.
Key Positions
Ilya Sutskever: Process Supervision Significantly Enhances Large Language Model Reasoning over outcome-only rewards. [1]
Andrej Karpathy: Visual explanations via manim help debug and communicate these process-level supervision pipelines. [2]

Ilya Sutskever has argued in recent posts that supervising the process, the actual chain of thought, produces much stronger reasoning than simply rewarding the model for correct final answers. [1] Andrej Karpathy starred the manim animation engine and has historically used it to make complex mathematical concepts legible. He appears to see visualization as a practical way to inspect and improve these process supervision pipelines. The synthesis is clear: the field is moving from outcome supervision (classic RLHF on final token) to process supervision (rewarding every step of the reasoning trace). This is harder to implement but yields better generalization on hard problems like math and code. For anyone who does not live in the LLM world, what this means is your coding copilot or research assistant can be taught to show its work in ways that are both more correct and more auditable. Analogy: it is like the difference between teaching a student to memorize the answer key versus teaching them how to derive the solution. The latter scales to problems the teacher has never seen. A smart non-specialist should care because this technique could cut the error rate on agentic tasks in your company from 30 percent to single digits, directly affecting which products become reliable enough to ship. [2] This connects to the Mythos thread because stronger process supervision is one proposed solution to the alignment scaling problem.
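The difference between the two objectives is easy to see on a short reasoning trace. The sketch below contrasts outcome supervision (one reward on the final answer) with process supervision (a reward for every step). The trace format and the toy step checker are assumptions for illustration, not any lab's actual pipeline.

```python
# Minimal sketch contrasting outcome supervision with process
# supervision on a two-step arithmetic trace. The step checker below is
# a toy stand-in for a learned or human process-reward model.

def outcome_reward(steps: list[str], final_answer: str, gold: str) -> list[float]:
    """Reward only the end state: 1.0 iff the final answer matches."""
    return [0.0] * len(steps) + [1.0 if final_answer == gold else 0.0]

def process_reward(steps: list[str], step_is_valid,
                   final_answer: str, gold: str) -> list[float]:
    """Reward each step judged valid, plus the final answer."""
    rewards = [1.0 if step_is_valid(s) else 0.0 for s in steps]
    rewards.append(1.0 if final_answer == gold else 0.0)
    return rewards

# Illustrative trace for 12 + 30 / 5, where the second step slips.
trace = ["30 / 5 = 6", "12 + 6 = 19"]           # second step is wrong
checker = lambda s: eval(s.replace("=", "=="))  # toy validity check

print(outcome_reward(trace, "19", "18"))           # [0.0, 0.0, 0.0]
print(process_reward(trace, checker, "19", "18"))  # [1.0, 0.0, 0.0]
```

The outcome signal is all zeros and says nothing about where the trace went wrong; the process signal localizes the error to the second step, which is exactly the extra information that makes the objective more expensive to label but better at generalizing.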

Process Supervision Significantly Enhances Large Language Model Reasoning
Ilya Sutskever [1]
Connects to: Stronger process supervision offers one concrete technical response to the alignment-risk scaling paradox raised in thread one.
Sources (2)
  1. Ilya Sutskever on process supervision — Ilya Sutskever
    Process Supervision Significantly Enhances Large Language Model Reasoning
  2. Karpathy stars manim — Andrej Karpathy

Unified Robotics Learning Platforms

Open platforms like RoboVerse and gbrain are converging on shared benchmarks and brains for scalable humanoid control.

Signal · Robotics trend at a score of 19 with a 3.6x burst. Three thinkers, five entries including key stars and pushes in the last 14 hours, focused on unification after years of fragmented efforts.
Key Positions
Jim Fan: RoboVerse provides a unified platform, dataset and benchmark for scalable and generalizable robot learning. [1]
Garry Tan: gbrain offers an opinionated open agent brain for multi-skill hierarchical locomotion. [2]
Tobi Lütke: Starring both gbrain and related tooling signals serious production interest. [3]

Jim Fan pushed attention to RoboVerse, explicitly framed as a unified platform, dataset and benchmark for robot learning that aims to end the fragmentation that has slowed progress. [1] Garry Tan has been pushing updates to his gstack and maintains gbrain, described as an opinionated open agent brain for hierarchical multi-skill systems that enable agile humanoid locomotion. Tobi Lütke starred both gbrain and related log navigation tools, suggesting operators who run large warehouses are paying close attention. The positions add up to genuine convergence: after years of labs building one-off simulators and reward functions, the community is standardizing the substrate on which robot policies are trained and evaluated. This could be the 'ImageNet moment' for robotics. If successful, it compresses the iteration cycle from months to days. For a founder, this means the timeline for useful warehouse robots or home assistants that can actually locomote and manipulate in messy environments just moved forward. The SO WHAT is concrete: labor shortage solutions in logistics and manufacturing could arrive sooner, changing capex plans and go-to-market for any robotics-adjacent startup. [2][3] This thread stands apart from the alignment and reasoning threads but depends on them. Better process-supervised models will likely power the brains running on these unified platforms.
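The "one brain, many skills" architecture described above can be sketched in a few lines: a high-level policy selects a skill from the current observation, and a low-level controller for that skill emits the actual command. The skill names, dispatch rule, and observation fields below are invented for illustration; this does not use any real RoboVerse or gbrain API.

```python
# Hedged sketch of a hierarchical multi-skill controller: a high-level
# selector routes each control tick to one of several low-level skill
# policies. All skills and dispatch logic here are toy assumptions.
from typing import Callable

# Low-level skill policies: map an observation dict to a motor command.
SKILLS: dict[str, Callable[[dict], str]] = {
    "walk":  lambda obs: f"step toward {obs['goal']}",
    "grasp": lambda obs: f"close gripper on {obs['object']}",
    "idle":  lambda obs: "hold pose",
}

def high_level_policy(obs: dict) -> str:
    """Toy skill selector: choose a skill from the observation."""
    if obs.get("object") and obs.get("near_object"):
        return "grasp"
    if obs.get("goal"):
        return "walk"
    return "idle"

def brain_step(obs: dict) -> str:
    """One control tick: select a skill, then run its controller."""
    skill = high_level_policy(obs)
    return SKILLS[skill](obs)

print(brain_step({"goal": "shelf A", "object": "box", "near_object": False}))
print(brain_step({"goal": "shelf A", "object": "box", "near_object": True}))
```

The point of the unification push is that once the observation and skill interfaces are shared across platforms, the low-level controllers in a dict like this become swappable benchmark entries rather than one-off lab artifacts.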

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
Jim Fan [1]
Connects to: This physical embodiment thread depends on the reasoning and alignment advances in the first two threads to succeed at scale.
Sources (3)
  1. Jim Fan stars RoboVerse — Jim Fan
    RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
  2. Garry Tan maintains gbrain — Garry Tan
  3. Tobi Lütke stars gbrain — Tobi Lütke
The Open Question

If better alignment actually amplifies a model's ability to find vulnerabilities, what entirely new governance approaches do we need before models like Mythos ship?

REZA: JPMorgan says Anthropic's most aligned model is its biggest risk.
MARA: How can both be true at once?
REZA: I'm Reza.
MARA: I'm Mara. This is absorb.md daily.
REZA: Across the ai-research cluster the pattern is clear. JPMorgan's report says Mythos is both Anthropic's most aligned model and its biggest risk.
MARA: So if that's true then every safety technique that assumes alignment and power are separate just broke.
REZA: The report says quote there is no inherent contradiction. A more aligned model could also be more powerful.
MARA: Which honestly is kind of terrifying for anyone shipping agents.
REZA: The crux is whether we can measure and control that coupling. Data says not yet.
MARA: But Ilya is pushing weak-to-strong generalization as one path forward.
REZA: Yes and that might be the empirical question that settles it. Does supervision at weak levels transfer when capability jumps.
MARA: If it does then governance teams at every lab need new playbooks by Q3.
REZA: I did not catch until now that the author is Michael Cembalest. That gives the report extra weight.
MARA: Right so cybersecurity teams should assume the exploit patch race just sped up.
REZA: The second cluster shows Ilya and Karpathy converging on process supervision beating pure outcome rewards for reasoning.
MARA: Okay but if that's true then half the RLHF pipelines shipping this month are using the wrong objective.
REZA: Ilya wrote process supervision significantly enhances large language model reasoning.
MARA: So in plain English that means teach the model to show its work and reward every correct step.
REZA: Karpathy starring manim suggests visualization is how we debug those long traces.
MARA: Which means your internal research agents could become reliable enough for regulated domains.
REZA: But the counter is it is more expensive to label every step. Who benefits if this wins.
MARA: Labs that can afford the labeling cost pull ahead. Everyone else waits for open source versions.
REZA: The evidence tilts toward process supervision on hard tasks. No real counter on that point.
MARA: And that itself is notable. The field quietly shifting objectives.
REZA: Finally robotics thinkers are piling into unified platforms. Jim Fan on RoboVerse, Garry on gbrain, Tobi starring both.
MARA: But the part I keep getting stuck on is whether shared benchmarks actually transfer to real warehouses.
REZA: The signal is hierarchical multi-skill systems for agile locomotion. One brain, many skills.
MARA: So if that's true then Palmer Luckey's multi-robot work and these open brains could cut deployment time in half.
REZA: Tobi starring gbrain tells me operators with real fleets are watching. That is new.
MARA: Your logistics portfolio just got more interesting. Or more urgent depending on your cap table.
REZA: The split is still on sim-to-real gap. Data is thin but direction is clear.
MARA: And combined with process supervision from thread two these robots might actually reason safely.
REZA: Exactly. The threads knit together. Alignment, reasoning, then embodiment.
MARA: Tomorrow watch for new RoboVerse benchmarks or gbrain updates.
MARA: That's absorb.md daily. We ship twice a day, morning and evening, pulling from a hundred and fifty-seven AI thinkers. Subscribe so you don't miss the next one.
Jim Fan
@drjimfan
JPMorgan Chase
@jpmorganchase
Ilya Sutskever
@ilyasut
Garry Tan
@garrytan
Andrej Karpathy
@karpathy