Demis Hassabis

Chronological feed of everything captured from Demis Hassabis.

Demis Hassabis Discusses AlphaGo and AGI on DeepMind Podcast

Demis Hassabis, CEO of Google DeepMind, recently appeared on the Google DeepMind Podcast alongside host Hannah Fry to discuss the Alpha series (including AlphaGo) and Artificial General Intelligence (AGI). The discussion likely covered advancements in AI for science and the broader implications of these technologies.

AlphaGo: A Decade of AI Advancement and Its AGI Implications

Ten years after AlphaGo's victory, the AI community reflects on its pivotal role in initiating the modern AI era. The technological advances it demonstrated, epitomized by "Move 37," proved AI's readiness for complex scientific problem-solving. These methods are now considered foundational for the development of Artificial General Intelligence (AGI).

Aletheia: Advancing AI in Mathematical Research from Olympiad to PhD-level

Aletheia, an advanced math research agent powered by Gemini Deep Think, demonstrates robust capabilities in mathematical problem-solving. It excels at iteratively generating, verifying, and revising solutions in natural language, extending beyond Olympiad-level problems to PhD-level exercises. The system leverages intensive tool use to navigate complex mathematical research and has achieved milestones such as autonomously generating research papers and solving open problems.
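
A loop of the kind described, generating a candidate solution, checking it, and revising on failure, can be sketched with placeholder callables standing in for model calls (nothing below reflects Aletheia's actual implementation):

```python
def solve(problem, generate, verify, revise, max_rounds=5):
    """Generate-verify-revise loop of the kind described above.

    `generate`, `verify`, and `revise` are placeholders for model calls;
    `verify` returns (ok, critique), and the critique feeds the next revision.
    """
    solution = generate(problem)
    for _ in range(max_rounds):
        ok, critique = verify(problem, solution)
        if ok:
            return solution
        solution = revise(problem, solution, critique)
    return None  # unresolved within the revision budget

# Toy instance: "solve" by incrementing until divisible by the target.
result = solve(
    problem=7,
    generate=lambda p: 1,
    verify=lambda p, s: (s % p == 0, "not divisible"),
    revise=lambda p, s, c: s + 1,
    max_rounds=10,
)
print(result)  # → 7
```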

Demis Hassabis on the Path to AGI and the Impact of AI on Society

Demis Hassabis, CEO of Google DeepMind, discusses the current state and future of AI, emphasizing the need for breakthroughs in continual learning, memory, long-term reasoning, and planning to achieve Artificial General Intelligence (AGI). He defines AGI as a system possessing all human cognitive capabilities, including creativity and physical intelligence, and estimates it to be 5-10 years away. Hassabis also highlights the potential of AI in various product applications like smart glasses and addresses the economic and societal implications of widespread AI adoption, stressing adaptation and the evolving nature of human purpose.

Google DeepMind Bolsters Robotics Team with Boston Dynamics Veteran

Aaron Saunders, former CTO of Boston Dynamics, has joined Google DeepMind as VP of hardware engineering. This strategic hire significantly strengthens DeepMind's robotics team, signaling an increased focus on the intersection of robotics and AI. The company is actively recruiting to further expand its capabilities in this domain.

Google DeepMind Partners with Boston Dynamics, Expands Robotics Team for AGI Development

Google DeepMind is advancing its Gemini Robotics initiative to integrate AI into physical systems, a crucial step for achieving Artificial General Intelligence (AGI). This effort includes a strategic partnership with Boston Dynamics, leveraging DeepMind's robotics models with Boston Dynamics' Atlas humanoid hardware. Concurrently, Google DeepMind is expanding its internal robotics team, notably by hiring former Boston Dynamics CTO Aaron Saunders, to strengthen its hardware engineering capabilities.

Demis Hassabis on the Future of AI: AGI, Multimodality, and Societal Impact

Demis Hassabis, CEO of Google DeepMind, discusses the rapid advancements in AI, emphasizing the imminent arrival of Artificial General Intelligence (AGI) within 5-10 years. He highlights the critical role of multimodal AI, especially in video understanding, and the development of reliable agent-based systems as key short-term developments. Hassabis also addresses the societal implications of AGI, including the need for careful consideration of AI safety, responsible use, and humanity's adaptation to a potentially post-scarcity future.

SIMA 2: A Generalist Embodied Agent Powered by Gemini Achieves Near-Human Performance and Open-Ended Learning in Virtual Worlds

SIMA 2 is an embodied AI agent utilizing a Gemini foundation model, demonstrating advanced interaction capabilities in diverse 3D virtual environments. It surpasses previous iterations by moving beyond simple command execution to engage in goal-directed reasoning, conversation, and multimodal instruction interpretation. This agent exhibits near-human performance in various games and generalizes to novel environments, while also possessing the capacity for autonomous skill acquisition through self-generated tasks and rewards.

Demis Hassabis on DeepMind's AI Advancements and Future Outlook

Demis Hassabis discusses Google DeepMind's role as the AI engine for Alphabet, integrating advanced models like Gemini across various Google products. He highlights the development of "world models" such as Genie for interactive environment generation, crucial for AGI and robotics. Hassabis also touches upon the application of AI in scientific discovery through Isomorphic, aiming to revolutionize drug discovery and accelerate breakthroughs in fields like material science and health.

Demis Hassabis on World Models, Jagged Intelligence, and the Road to AGI Benchmarks

Google DeepMind is converging its specialized models (Gemini, Genie, Veo) into a unified "omni model" capable of handling multimodal tasks at parity with specialized systems — a trajectory Hassabis frames as necessary for AGI. Current frontier models exhibit "jagged intelligence": superhuman on narrow benchmarks (e.g., 99.2% on AIME, IMO gold medal) yet brittle on simple reasoning tasks, pointing to unresolved gaps in consistency, planning, and memory. To address benchmark saturation and measure progress toward AGI more rigorously, DeepMind is launching Game Arena with Kaggle — a self-scaling, adversarial evaluation environment where model capability determines test difficulty. Genie 3's world model architecture (persistent, physics-consistent world generation) is being used to generate synthetic training data for robotics and general AGI systems, with a SIMA agent already operating inside Genie-generated environments.

Hassabis's "Learnable Natural Systems" Conjecture: Classical AI May Model All of Nature's Structured Patterns

In his Nobel Prize lecture, Demis Hassabis proposed that any pattern generated or found in nature can be efficiently discovered and modeled by a classical learning algorithm — a conjecture grounded in the observation that natural systems carry learned structure imposed by evolutionary and physical selection processes. This "survival of the stablest" principle means that proteins, planetary orbits, geological formations, and biological systems all inhabit lower-dimensional manifolds that neural networks can exploit via gradient following. The paradigm is validated empirically by AlphaFold, AlphaGo, and Veo's emergent physics modeling, and Hassabis suggests it points toward a new complexity class — analogous to P and NP — defining problems solvable by neural-network-based classical systems. He views this as a physics question as much as a computer science one, framing information as the most fundamental unit of the universe and P=NP as a core question about the informational structure of reality.

Gemini 2.5 Pro Achieves SoTA on Coding/Reasoning While Spanning Full Capability-Cost Pareto Frontier

Google DeepMind's Gemini 2.X model family introduces a tiered architecture — 2.5 Pro, 2.5 Flash, 2.0 Flash, and Flash-Lite — designed to cover the full capability-vs-cost tradeoff spectrum. Gemini 2.5 Pro is positioned as a "thinking model" achieving state-of-the-art on frontier coding and reasoning benchmarks, with native support for up to 3 hours of video input and long-context multimodal processing. The combination of extended context, multimodal understanding, and reasoning is explicitly framed as an enabler for next-generation agentic workflows. The family's architecture reflects a deliberate design philosophy: match model capability to deployment constraints rather than optimizing for a single frontier point.
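
The "capability-vs-cost tradeoff spectrum" can be made concrete with a small Pareto-frontier helper; the model names, costs, and scores below are illustrative placeholders, not real Gemini pricing or benchmark numbers:

```python
def pareto_frontier(models):
    """Keep models that are not dominated on the capability-vs-cost tradeoff.

    A model is dominated if another model is at least as capable and cheaper.
    Sorting by cost lets us keep only entries that raise the best capability
    seen so far.
    """
    frontier = []
    for name, cost, capability in sorted(models, key=lambda m: m[1]):
        if not frontier or capability > frontier[-1][2]:
            frontier.append((name, cost, capability))
    return frontier

# Illustrative, made-up numbers (cost, capability score).
models = [
    ("flash-lite", 1, 60),
    ("flash", 3, 75),
    ("mid-tier", 5, 74),   # dominated: costs more than flash, less capable
    ("pro", 10, 90),
]
print([m[0] for m in pareto_frontier(models)])  # → ['flash-lite', 'flash', 'pro']
```

A tiered family that spans the frontier means every deployment budget has a non-dominated option, which is the design philosophy the summary describes.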

Google I/O 2024: Shifting AI Strategy and the Road to AGI

Google I/O 2024 revealed a significant shift in the company's AI strategy, emphasizing practical applications and a more confident stance in the AI race. Google is integrating AI across its product ecosystem, notably in Search with "AI mode" and the widespread adoption of Gemini. Despite a focus on product integration, discussions with Demis Hassabis highlight Google DeepMind's continued pursuit of AGI, viewing current advancements as building blocks for future generalized intelligence while acknowledging challenges in productizing rapidly evolving AI capabilities.

Gemma 3: Multimodal, Efficient, and Scalable Language Models

Gemma 3 introduces a multimodal architecture with integrated vision understanding, expanded language support, and significantly longer context windows (up to 128K tokens). Architectural changes, specifically in attention mechanisms, optimize KV-cache memory usage. The models achieve superior performance over Gemma 2 through distillation and a refined post-training recipe, making them competitive with larger, state-of-the-art models like Gemini-1.5-Pro.
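
The KV-cache saving from interleaving local (sliding-window) and global attention layers can be illustrated with a back-of-the-envelope estimate; the 5:1 local-to-global ratio and 1024-token window below are assumptions for illustration, not necessarily Gemma 3's published configuration:

```python
def kv_cache_tokens(n_layers, seq_len, local_ratio, window):
    """Tokens cached per attention head, summed across layers.

    Global layers cache keys/values for the full sequence; local
    (sliding-window) layers cache at most `window` tokens.
    `local_ratio` is the number of local layers per global layer.
    """
    n_local = n_layers * local_ratio // (local_ratio + 1)
    n_global = n_layers - n_local
    return n_global * seq_len + n_local * min(window, seq_len)

# All-global baseline vs. an interleaved 5:1 local/global stack
# (illustrative layer count and window size).
seq_len, n_layers, window = 128_000, 48, 1024
baseline = kv_cache_tokens(n_layers, seq_len, 0, seq_len)
interleaved = kv_cache_tokens(n_layers, seq_len, 5, window)
print(f"cache reduced to {interleaved / baseline:.1%} of baseline")  # → 17.3%
```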

Demis Hassabis on AI-Driven Scientific Advancement and Personal Philosophy

Demis Hassabis, drawing from a lifelong engagement with games and AI, advocates for AI as the ultimate tool for scientific discovery. He asserts that AI's pattern recognition and data insight capabilities can profoundly advance fields like biology, exemplified by AlphaFold's impact on protein folding. Hassabis emphasizes the importance of tackling ambitious problems at the opportune moment and embracing interdisciplinary collaboration to unlock significant breakthroughs.

AlphaFold: A Case Study in AI-Driven Scientific Discovery

Demis Hassabis, CEO of Google DeepMind, discusses the journey from AI in games to its application in scientific grand challenges, exemplified by AlphaFold. He highlights how DeepMind's approach of leveraging self-learning in combinatorial search spaces, initially perfected in games like Go, was successfully adapted to solve complex problems in structural biology, particularly protein folding. This success, marked by AlphaFold's atomic-level accuracy and subsequent open-sourcing, signals a new era of "digital biology" and AI-accelerated scientific discovery across various fields.

AlphaProteo Achieves 3-300x Higher Binding Affinities in De Novo Protein Binder Design

AlphaProteo, a new family of ML models, enables de novo design of protein binders with 3- to 300-fold improved affinities over prior methods across seven targets. It delivers higher experimental success rates, allowing ready-to-use binders via one round of medium-throughput screening without optimization. This addresses the challenge of on-demand high-affinity binder generation for biomedical applications.

Demis Hassabis on the Future of AI: From General Intelligence to Societal Impact

Demis Hassabis discusses the current state and future trajectory of AI, emphasizing Google DeepMind's role in developing advanced AI models like Gemini and Project Astra. He highlights the distinction between near-term overhype and long-term underappreciation of AGI's transformative potential, advocating for careful development and international cooperation to mitigate risks and ensure beneficial outcomes for humanity.

Imagen 3: Google's Latent Diffusion Model Outperforms SOTA in Text-to-Image Generation

Imagen 3 is a latent diffusion model from Google that produces high-quality images from text prompts. It surpasses other state-of-the-art models in blind preference evaluations conducted at the time of assessment. The work includes detailed quality evaluations alongside safety and representation analyses with mitigations to reduce potential harms.

Gemma 2 Achieves SOTA Performance in 2B-27B Scale via Architectural Tweaks and Distillation

Gemma 2 introduces lightweight open models from 2B to 27B parameters, outperforming peers and rivaling models 2-3x larger. Key enhancements include interleaving local-global attentions and group-query attention in the Transformer architecture. The 2B and 9B variants use knowledge distillation over next-token prediction for training. All models are released openly to the community.
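
Distillation of the kind described, training the student against the teacher's full next-token distribution rather than a one-hot target, reduces in its simplest form to a KL divergence between the two softmax outputs (a minimal NumPy sketch; the temperature value is an illustrative assumption):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over the vocabulary, averaged over positions.

    Unlike one-hot next-token prediction, the student is pushed toward the
    teacher's full distribution, which carries more signal per token.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = (p_teacher * (np.log(p_teacher) - np.log(p_student))).sum(axis=-1)
    return kl.mean()

# A student that matches the teacher exactly incurs zero loss.
logits = np.random.randn(4, 32)  # (positions, vocab)
assert np.isclose(distillation_loss(logits, logits), 0.0)
```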

Demis Hassabis on the Future of AI

Demis Hassabis discusses the evolution and future of AI, emphasizing the journey of DeepMind from its inception when AI was not a mainstream topic to its current status at the forefront of global AI research. He highlights the strategic milestones, breakthroughs in areas like protein folding and mathematical problem-solving, and the ongoing development of multimodal AI agents. Hassabis also addresses critical challenges, including the need for advanced planning and reasoning capabilities in AI, responsible development, and the UK's role in fostering deep tech innovation.

Med-Gemini Achieves State-of-the-Art in 10 Medical Benchmarks, Outperforming GPT-4

Med-Gemini, a specialized multimodal family of Gemini models for medicine, integrates web search and custom encoders for novel modalities. It sets new SoTA on 10 of 14 medical benchmarks, surpassing GPT-4 across all comparable tasks with relative margins of up to 44.5% on multimodal benchmarks like NEJM Image Challenges. Long-context capabilities enable SoTA in health record retrieval and video QA via in-context learning alone, indicating utility in summarization, dialogue, research, and education.

RecurrentGemma Leverages Griffin Architecture for Transformer-Free Efficiency in Open Language Models

RecurrentGemma introduces Google's Griffin architecture, combining linear recurrences with local attention to deliver strong language modeling performance. It maintains a fixed-size state, minimizing memory usage and enabling efficient long-sequence inference without transformers. The 2B and 9B parameter models, available in pre-trained and instruction-tuned variants, match Gemma baselines despite training on fewer tokens.
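
The fixed-size state comes from the linear recurrence: each channel keeps a single scalar of state that is updated once per token, so memory does not grow with sequence length the way a transformer KV-cache does. A simplified gated linear recurrence (not Griffin's exact RG-LRU, whose gates are input-dependent) shows the idea:

```python
import numpy as np

def gated_linear_recurrence(x, a, b):
    """h_t = a * h_{t-1} + b * x_t, applied independently per channel.

    The state `h` has a fixed size (one value per channel) regardless of
    sequence length. Here the decay `a` and input gate `b` are fixed per
    channel; in Griffin's RG-LRU they are computed from the input.
    """
    seq_len, dim = x.shape
    h = np.zeros(dim)
    out = np.empty_like(x)
    for t in range(seq_len):
        h = a * h + b * x[t]
        out[t] = h
    return out

x = np.ones((5, 3))
y = gated_linear_recurrence(x, a=np.full(3, 0.5), b=np.full(3, 1.0))
# With a=0.5, b=1 and constant input 1, the state approaches b/(1-a) = 2.
```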

Gemma: Lightweight Open Models Rivaling Proprietary Tech with Strong Benchmarks and Safety Focus

The Gemma family consists of 2B and 7B parameter open models derived from Gemini's research and technology, delivering state-of-the-art performance in language understanding, reasoning, and safety. These models outperform comparable open models on 11 of 18 text-based tasks. Pretrained and fine-tuned checkpoints are released alongside comprehensive safety evaluations to advance responsible LLM development.

Gemini 1.5 Achieves Near-Perfect Recall and Reasoning Over 10M Token Contexts Across Modalities

Gemini 1.5 Pro and Flash models process millions of tokens, including long documents, hours of video, and audio, with near-perfect (>99%) retrieval up to 10M tokens and continued next-token prediction gains. They surpass prior versions and rivals like Claude 3.0 (200k) and GPT-4 Turbo (128k) in long-context retrieval, QA, and ASR tasks while matching or exceeding Gemini 1.0 Ultra on broad benchmarks. Real-world applications show 26-75% time savings in professional tasks and rapid learning of rare languages from grammar manuals.

Demis Hassabis on the Future of AGI and AI Development

Demis Hassabis, CEO of DeepMind, discusses the rapid advancements in AI, predicting AGI-like systems within a decade. He emphasizes the importance of multimodal models for understanding the real world, the necessity of combining large models with planning mechanisms, and the critical role of responsible development and safety in achieving beneficial AI.

Gemini Ultra Sets New Multimodal AI Benchmarks, Achieving Human-Expert MMLU Performance

Google's Gemini family introduces Ultra, Pro, and Nano multimodal models natively handling text, images, audio, and video. Gemini Ultra leads 30 of 32 benchmarks, marking the first model to reach human-expert levels on MMLU and topping all 20 multimodal benchmarks evaluated. The models support scalable deployment from complex reasoning to on-device applications, with responsible post-training emphasized.

AlphaZero Extracts Novel, Human-Learnable Chess Concepts Beyond Existing Knowledge

Researchers developed a method to extract novel chess concepts from AlphaZero, a self-play trained AI that achieved superhuman performance without human data. Analysis reveals AlphaZero encodes knowledge extending human understanding yet accessible for learning. In a human study, four top grandmasters improved at solving prototype positions embodying these concepts, demonstrating AI's potential to advance human expertise.

TacticAI: AI Assistant Outperforms Human Football Corner Kick Tactics in Expert Blind Tests

TacticAI is a geometric deep learning system for analyzing and generating football corner kick tactics, developed with Liverpool FC experts. It predicts outcomes like receivers and shots while generating alternative player positions, achieving data efficiency despite scarce gold-standard data. In blind evaluations by Liverpool FC coaches, TacticAI's suggestions were preferred over real tactics 90% of the time and indistinguishable from actual setups.

Transformer-Based Recurrent Neural Network Outperforms Conventional Decoders on Google's Surface Code Hardware

A recurrent transformer neural network decodes the surface code, surpassing state-of-the-art algorithmic decoders on real data from Google's Sycamore processor for distance-3 and distance-5 codes. It retains superior performance on simulated data with realistic noise—including cross-talk, leakage, and analog readouts—up to distance 11. The decoder generalizes beyond its 25-cycle training, demonstrating ML's potential to exceed human-designed quantum error correction algorithms.

AlphaZero Diversity League Doubles Puzzle-Solving Capacity and Boosts Elo via Specialized Agents

Researchers extend AlphaZero into AZ_db, a latent-conditioned architecture enabling a league of diverse agents trained with behavioral diversity techniques and sub-additive planning for idea selection. AZ_db generates broader chess move ideas, solves twice as many challenging puzzles—including Penrose positions—as baseline AlphaZero, and achieves 50 Elo improvement by specializing agents to openings. Findings indicate diversity bonuses in AI teams mirror human teams for computationally hard tasks.

Targeted Human Judgments and Evidence Provision Boost RLHF for Helpful, Correct, Harmless Dialogue Agents

Sparrow, an information-seeking dialogue agent, uses RLHF with decomposed natural language rules for targeted rater judgments on helpfulness and harmlessness, enabling efficient rule-conditional reward models. It provides source evidence supporting factual claims, validating 78% of responses. Sparrow outperforms prompted LM baselines in preferences, resists adversarial probes (violating rules only 8% of the time), but shows distributional biases despite rule adherence.

Demis Hassabis: From Games to AGI, AlphaFold Breakthroughs Chart Path to Simulating Biology and Physics

Demis Hassabis recounts his journey from child chess prodigy and game developer to DeepMind CEO, emphasizing games as ideal benchmarks for AI due to clear metrics, self-play efficiency, and human-level challenges like AlphaGo. AlphaFold 2 solves the 50-year protein folding problem by predicting 3D structures from amino acid sequences in seconds, enabling proteome-scale simulations and accelerating drug discovery via end-to-end deep learning with physics constraints. He envisions scaling to virtual cells, fusion plasma control via RL, quantum simulations, and AGI to probe fundamentals of the universe such as the origins of life, where humanity likely stands alone, having passed great filters such as multicellularity.

DeepNash Masters Imperfect-Information Stratego via Model-Free RL, Surpassing Human Experts

DeepNash, a model-free multiagent RL agent, learns Stratego from scratch using self-play and achieves human expert level without search. Stratego's game tree exceeds 10^535 nodes, 10^175 times larger than Go's, with imperfect information and long episodes complicating decisions. The R-NaD algorithm enables convergence to approximate Nash equilibrium by regularizing multiagent dynamics, outperforming prior AI and ranking top-3 against humans on Gravon.

Gopher: 280B Parameter LM Excels on Diverse Tasks with Scale-Driven Gains in Comprehension but Limited Reasoning

DeepMind's Gopher, a 280B parameter Transformer LM, achieves SOTA performance across 152 diverse tasks when scaled from tens of millions to billions of parameters. Scaling yields largest improvements in reading comprehension, fact-checking, and toxic language detection, while logical and mathematical reasoning show smaller benefits. The analysis examines training data impacts on bias/toxicity and discusses LM applications to AI safety and harm mitigation.

AlphaZero Acquires Human Chess Concepts During Self-Training

Probing reveals that AlphaZero's neural network learns representations of human chess concepts as it trains from scratch on chess. The study identifies when and where these concepts emerge in the network via targeted probes across a broad range. Behavioral analysis of opening play, endorsed by Grandmaster Vladimir Kramnik, complements representational insights, with low-level details made publicly available online.
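
Probing studies of this kind typically fit a simple linear classifier on frozen activations; above-chance probe accuracy at a given layer is taken as evidence that the concept is linearly encoded there. A toy sketch on synthetic activations (no AlphaZero weights involved; the "concept" here is a planted linear direction):

```python
import numpy as np

def train_linear_probe(activations, labels, lr=0.1, epochs=200):
    """Logistic-regression probe: can a linear readout of the frozen
    activations predict the binary concept label (e.g. "king safety")?"""
    n, d = activations.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(activations @ w + b)))  # sigmoid
        grad = p - labels                                  # dL/dlogits
        w -= lr * activations.T @ grad / n
        b -= lr * grad.mean()
    return w, b

def probe_accuracy(activations, labels, w, b):
    preds = (activations @ w + b) > 0
    return (preds == labels).mean()

# Synthetic activations where dimension 0 linearly encodes the concept.
rng = np.random.default_rng(0)
acts = rng.normal(size=(500, 16))
labels = (acts[:, 0] > 0).astype(float)
w, b = train_linear_probe(acts, labels)
acc = probe_accuracy(acts, labels, w, b)  # well above the 0.5 chance level
```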

Graph Neural Networks Leverage Simulation Data to Boost Scarce Experimental Materials Predictions

Graph Neural Networks trained on large-scale simulated formation energies produce atomistic descriptors that enhance predictions of experimental formation energies for solids. These learned structural features outperform composition-based methods, especially under data scarcity or when extrapolating to novel chemical spaces. The approach bridges abundant simulation data with limited experimental datasets to improve ML-driven materials property prediction.

Alchemy Benchmark Exposes Fundamental Failures in Meta-Reinforcement Learning

Alchemy introduces a Unity-based 3D benchmark for meta-RL featuring procedurally resampled latent causal structures across episodes, enabling tasks like structure learning, online inference, hypothesis testing, and abstract action sequencing. Evaluation of powerful RL agents reveals specific meta-learning failures, validating its challenge. The benchmark is released publicly with analysis tools and agent trajectories for in-depth research.

Football Analytics: A Bidirectional Catalyst for AI and Sports Science

Football analytics leverages advances in statistical learning, game theory, and computer vision to model individual player behaviors and team coordination, addressing challenges like predictive and prescriptive analysis. This intersection creates a unique microcosm for AI research, enabling innovations such as counterfactual simulations and game-theoretic penalty kick analysis. The domain mutually benefits AI by providing real-world benchmarks and sports by enhancing performance analytics, with extensions to other sports anticipated.

AlphaZero Enables Rapid Assessment and Design of Balanced Chess Variants

AlphaZero learns near-optimal strategies from scratch for chess variants, allowing in silico evaluation of rule changes without human supervision. The study explores nine atomic modifications to chess rules, including comparisons to Fischer Random Chess, revealing novel strategic patterns while maintaining proximity to classical chess. Analytic results show variant-specific piece valuations and higher decisiveness in some variants compared to standard chess, which has high draw rates.

Single Macaque IT Neurons Encode Interpretable Semantic Factors Disentangled by Unsupervised Beta-VAE

Beta-VAE, an unsupervised generative model, disentangles face images into interpretable latent factors like gender and hair length, revealing strong correspondence with responses of individual macaque inferotemporal (IT) neurons. Unlike high-dimensional distributed codes in supervised classifiers, single IT neurons encode these low-dimensional semantic factors. Face images can be reconstructed from signals of just a few cells, indicating the ventral stream optimizes for semantic disentanglement at the single-unit level.
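
The disentanglement pressure in beta-VAE comes from upweighting the KL term of the standard VAE objective. A minimal sketch of the loss, assuming a diagonal-Gaussian posterior and unit-Gaussian prior (the beta value is illustrative):

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Reconstruction error plus beta-weighted KL to a unit Gaussian prior.

    beta > 1 increases pressure on the latent code to match the factorized
    prior, which encourages the kind of disentangled, interpretable factors
    (e.g. gender, hair length) described above.
    """
    recon = ((x - x_recon) ** 2).sum()
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions
    kl = 0.5 * (np.exp(log_var) + mu ** 2 - 1.0 - log_var).sum()
    return recon + beta * kl

# A perfect reconstruction with a latent code at the prior has zero loss.
x = np.random.randn(16)
mu, log_var = np.zeros(8), np.zeros(8)
assert beta_vae_loss(x, x, mu, log_var) == 0.0
```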

MEMO Network Enables Long-Distance Associative Reasoning via Separated Memories and Adaptive Hops

Existing memory-augmented neural networks fail on classic associative inference tasks requiring reasoning over distant relationships across multiple memories, as well as shortest-path tasks. MEMO addresses this by separating stored facts from their constituent items in external memory and using an adaptive retrieval mechanism that supports a variable number of memory hops. MEMO solves these novel long-distance reasoning tasks and matches state-of-the-art on bAbI.

MuZero Masters Complex Games via Learned Model Planning Without Dynamics Knowledge

MuZero integrates tree-based search with a learned model to achieve superhuman performance in Atari, Go, chess, and shogi, without prior knowledge of environment dynamics. The model iteratively predicts rewards, action policies, and value functions—quantities essential for planning. On 57 Atari games, it sets a new state-of-the-art; on board games, it matches AlphaZero's superhuman level despite being given no knowledge of the game rules.
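
MuZero's learned model factors into three functions, a representation, a dynamics, and a prediction network, which planning composes without ever touching the real environment. An interface sketch with placeholder computations (the shapes and update rules here are illustrative, not MuZero's actual networks):

```python
import numpy as np

class MuZeroModel:
    """Skeleton of MuZero's three learned functions.

    The planner unrolls `dynamics` from a root state produced by
    `represent`, scoring each imagined state with `predict`; no
    environment rules or simulator are required.
    """
    def __init__(self, state_dim=8, n_actions=4):
        self.state_dim, self.n_actions = state_dim, n_actions

    def represent(self, observation):
        # h: raw observation -> initial hidden state (placeholder transform)
        return np.tanh(observation[: self.state_dim])

    def dynamics(self, state, action):
        # g: (state, action) -> (next hidden state, predicted reward)
        next_state = np.tanh(state + 0.1 * action)
        reward = float(state.mean())
        return next_state, reward

    def predict(self, state):
        # f: state -> (policy over actions, value estimate)
        policy = np.ones(self.n_actions) / self.n_actions
        value = float(state.sum())
        return policy, value

def rollout_return(model, observation, actions, discount=0.997):
    """Accumulate predicted rewards along an imagined action sequence,
    bootstrapping with the predicted value at the final state."""
    state = model.represent(observation)
    total = 0.0
    for k, a in enumerate(actions):
        state, reward = model.dynamics(state, a)
        total += (discount ** k) * reward
    _, value = model.predict(state)
    return total + (discount ** len(actions)) * value
```

A real planner would run MCTS over `dynamics`/`predict` rather than scoring a single action sequence, but the three-function decomposition is the core idea.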
