Chronological feed of everything captured from Google DeepMind.
tweet / @GoogleDeepMind / Apr 20
Google DeepMind's Gemini 3.1 Flash TTS introduces Audio Tags for fine-grained control over vocal style, delivery, and pace using simple text commands. It produces more natural-sounding speech, supports over 70 languages including Hindi, Japanese, and German, and embeds SynthID watermarks in all outputs. Access is available via Gemini API and Google AI Studio for developers, Vertex AI preview for enterprises, and Google Vids for general users.
gemini-3.1-flash text-to-speech tts-audio-tags multilingual-support synthid-watermarking google-deepmind
“Gemini 3.1 Flash TTS is the most controllable text-to-speech model from Google DeepMind”
tweet / @GoogleDeepMind / Apr 20
Gemini 3.1 Flash TTS enables fine-grained control over vocal style, delivery, and pace using simple text-based Audio Tags. It delivers more natural-sounding speech in over 70 languages including Hindi, Japanese, and German, with SynthID watermarking embedded in all outputs. Developers can preview it via Gemini API and Google AI Studio, enterprises through Vertex AI, and general users via Google Vids.
gemini-tts text-to-speech audio-tags multilingual-support synthid-watermarking google-deepmind ai-api
“Gemini 3.1 Flash TTS is the most controllable text-to-speech model from Google DeepMind”
tweet / @GoogleDeepMind / Apr 20
Gemini 3.1 Flash TTS, DeepMind's most controllable TTS model featuring Audio Tags for style, delivery, and pace control, plus natural speech, 70+ language support, and SynthID watermarking, is now rolling out. Developers access previews via Gemini API and Google AI Studio; enterprises via Vertex AI; general users via Google Vids. This enables targeted deployment across developer, enterprise, and consumer platforms.
gemini-tts text-to-speech google-deepmind ai-models audio-generation tts-features
“Gemini 3.1 Flash TTS is available in preview via Gemini API and Google AI Studio for developers.”
tweet / @GoogleDeepMind / Apr 20
Google DeepMind integrated Gemini Robotics embodied reasoning models with Boston Dynamics' Spot robot, allowing it to perceive surroundings, identify objects, and execute tasks like room tidying via plain English commands. A software bridge provides Spot with tools for mobility, imaging, and manipulation, bypassing complex coding. This setup demonstrates end-to-end embodied AI for real-world robotics.
google-deepmind boston-dynamics robotics gemini-model embodied-reasoning spot-robot ai-robotics
“Google DeepMind collaborated with Boston Dynamics to integrate Gemini Robotics embodied reasoning models into Spot robot”
tweet / @GoogleDeepMind / Apr 20
Google DeepMind integrated Gemini Robotics embodied reasoning models with Boston Dynamics' Spot robot, allowing it to perceive surroundings, identify objects, and execute tasks like room tidying from plain English commands. This replaces complex coding with natural language interaction through a bridge providing Spot with mobility, imaging, and manipulation tools. The setup demonstrates end-to-end embodied AI for complex real-world tasks without custom programming.
deepmind gemini-robotics boston-dynamics spot-robot embodied-reasoning robotics-ai natural-language-control
“Google DeepMind partnered with Boston Dynamics to integrate Gemini Robotics embodied reasoning models into the Spot robot.”
youtube / GoogleDeepMind / Apr 13
Demis Hassabis envisions AI accelerating scientific discovery, particularly in drug development, through self-improving algorithmic loops. He outlines a process where AI designs and virtually tests compounds, dramatically increasing efficiency. However, he also raises concerns about the dual-use nature of advanced AI, highlighting risks from malicious actors and the challenge of ensuring AI alignment and control as systems become more capable and autonomous.
demis-hassabis deepmind agi-safety drug-discovery ai-capabilities future-of-ai ai-ethics
“AI can significantly expedite drug discovery by simulating compound interactions and optimizing for efficacy and safety.”
youtube / GoogleDeepMind / Apr 13
AI is poised to revolutionize filmmaking by drastically reducing production costs and enabling greater creative control for independent creators. This shift will lead to a renaissance in indie film, particularly in documentaries, and facilitate highly personalized content experiences. Google DeepMind is actively contributing to this future through initiatives like Google Flow, which offers accessible video generation tools, and Project Genie, focused on interactive world models crucial for advancing AGI.
ai-creative-tools generative-ai film-production google-deepmind ai-models deepmind-products
“AI will significantly lower filmmaking costs, democratizing the industry and fostering a new era of independent and auteur filmmaking.”
youtube / GoogleDeepMind / Apr 13
Modern AI, particularly large language models, can interpret and adhere to "robot constitutions": high-level principles governing behavior, a concept that was previously hard to implement. This approach to AI alignment uses textual constitutions to guide robot actions, and the resulting behavior aligns with human preferences far more closely than the behavior depicted in science fiction. The research indicates that automatically generated and optimized constitutions, drawing from diverse sources like sci-fi scenarios, images, and injury reports, can effectively safeguard against undesirable AI behaviors and offer a scalable solution for ethical AI deployment.
robot-ethics ai-safety robot-ai llm-constitution ai-alignment moral-philosophy human-robot-interaction
“Science fiction generally misrepresents AI behavior, often portraying misalignment with human preferences due to plot devices like misinterpreting directives or lacking common sense.”
youtube / GoogleDeepMind / Apr 12
ML models have advanced dramatically in verifiable tasks like math and coding, achieving gold medals in IMO and ICPC, while agentic workflows now enable hours-long autonomous operation with self-correction. NVIDIA targets 10k-20k tokens/sec per user by minimizing on/off-chip communication latency to speed-of-light limits through static scheduling and simplified PHYs. Self-improvement emerges via natural language-directed experiments in NAS and distillation; inference dominates workloads (90% power), demanding specialized hardware for prefill, attention, and decode stages. Future scaling leverages untapped video/robotics data, synthetic generation, and action-interleaved pretraining beyond Chinchilla laws.
machine-learning-advances llm-inference hardware-architecture ai-agents ethical-ai-use ai-hardware-codesign education-technology
“Gemini model achieved a gold medal in the IMO contest and ICPC coding contest”
youtube / GoogleDeepMind / Apr 12
This talk explores the rapidly evolving field of agentic AI, focusing on the tension between AI-driven speed and the need for human-centric quality and control. Key themes include the shift in software engineering bottlenecks from intelligence to human attention, the emergence of faster AI models, and strategies for effective human-agent collaboration in complex software development workflows. The emphasis is on building agent-legible codebases, leveraging agents for tasks like refactoring and documentation, and rethinking evaluation and control mechanisms to ensure high-quality, tasteful software in an increasingly agent-driven world.
ai-engineering-conference llm-development agentic-systems software-development-lifecycle ai-ethics developer-experience
“The traditional bottleneck in software engineering has shifted from intelligence to human attention, making it crucial to manage agent interactions effectively to scale development.”
youtube / GoogleDeepMind / Apr 9
Sundar Pichai reflects on Google's decade of AI leadership, highlighting the company's foundational role in AI advancements like Transformers, which were developed internally to solve product challenges. He addresses the perception of Google falling behind in the "AI race," asserting that Google had advanced AI products like LaMDA internally but exercised caution in public release due to quality and safety concerns. Pichai outlines Google's strategic focus on speed and efficiency in AI products, the evolution of Search into an agentic future, and the company's long-term investments in AI infrastructure and various cutting-edge projects.
ai-strategy llm-development product-innovation google-search capital-allocation ai-infrastructure organizational-change
“Transformers were invented at Google to solve specific product needs, such as improving translation and solving inference for speech recognition at scale.”
youtube / GoogleDeepMind / Apr 9
AI development has shifted significantly from symbolic, rule-based approaches to empiricist, data-driven learning. Early AI struggled to codify common sense and handle the messy, exception-filled nature of the real world, unlike modern large language models. These models, by processing massive datasets and leveraging architectural innovations like the Transformer, achieve complex reasoning and generalization through statistical prediction, mirroring aspects of biological intelligence.
ai-safety neural-networks nlp cognitive-neuroscience machine-learning-history llm-capabilities agentic-ai
“Early AI research was predominantly based on a "rationalist" or "symbolic" school of thought, attempting to create structured systems based on predefined rules.”
youtube / GoogleDeepMind / Apr 9
AI's rapid development necessitates a re-evaluation of fundamental philosophical questions regarding mathematics and human thought. The paper "Mathematical Methods and Human Thought in the Age of AI" by Terence Tao and Tanya Klowden proposes a "Copernican view of intelligence," advocating for a collaborative approach with AI rather than focusing on a singular, linear progression of intelligence. This perspective emphasizes appreciating diverse forms of intelligence (human, computer, and collaborative) to unlock novel possibilities and overcome current limitations.
ai-ethics human-computer-interaction philosophy-of-ai ai-collaboration interdisciplinary-research mathematics-and-ai
“The development of AI compels a philosophical re-evaluation of mathematics and science.”
youtube / GoogleDeepMind / Apr 8
AI models capable of persuasion pose significant manipulation risks, necessitating robust safety research. Google DeepMind’s research framework defines manipulation by intent and method, distinguishing beneficial persuasion (fact-based) from harmful manipulation (emotion/bias exploitation). Findings emphasize context-specific manipulation efficacy and the critical role of explicit manipulative goals in AI behavior, underscoring the need for continuous, iterative safety evaluations before widespread deployment.
ai-safety ai-ethics persuasive-ai human-computer-interaction safety-research
“Harmful AI manipulation is distinct from beneficial persuasion, primarily differing in intent and method.”
youtube / GoogleDeepMind / Apr 8
Reflection AI aims to challenge the dominance of closed-source labs and Chinese open models by developing frontier-scale, open-weight agentic AI. By leveraging MoE architectures and deep reinforcement learning, they intend to provide a transparent, customizable alternative that enables complex tool use and autonomous task execution. Their strategy hinges on attracting top-tier talent from DeepMind and OpenAI to implement 'open science' principles at a frontier scale.
open-source-ai agentic-models deepmind-alumni ai-startups frontier-ai ai-policy model-architectures
“Reflection AI is developing a family of open-weight, frontier agentic models capable of multi-step reasoning and end-to-end task completion.”
youtube / GoogleDeepMind / Apr 7
Demis Hassabis, co-founder and CEO of Google DeepMind, is leading the charge toward Artificial General Intelligence (AGI), aiming for systems with human-like versatility but superhuman speed and knowledge within the next 5-10 years. DeepMind is developing multimodal AI models like Project Astra, which can interpret and interact with the world through vision and hearing, and Gemini, which will act in the world by performing tasks like booking tickets. Hassabis acknowledges the exponential progress in AI and the challenge of ensuring these increasingly autonomous systems remain aligned with human values and safety guardrails, especially given the competitive landscape of AI development.
agi demis-hassabis google-deepmind ai-safety robotics artificial-intelligence
“DeepMind aims to achieve Artificial General Intelligence (AGI) within 5-10 years, creating systems as versatile as humans but with superhuman speed and knowledge.”
youtube / GoogleDeepMind / Apr 7
Google DeepMind is leveraging native multimodality to develop real-world digital assistants and robotics, while positioning Gemini as the engine for Google's broader product ecosystem. The organization is pivoting toward 'unequivocal goods'—AI for medicine and material science—to mitigate societal tech-lash and establish long-term value beyond the current AI investment cycle.
ai-models google-deepmind gemini ai-development multimodal-ai agi ai-ethics material-science-ai drug-discovery-ai ai-strategy
“Gemini's native multimodality is the foundational requirement for real-world AI assistants and robotics.”
youtube / GoogleDeepMind / Apr 7
David Silver introduces the "era of experience," a new phase for AI development focusing on systems generating their own data through interaction with the world, rather than relying solely on human-generated data. This approach, exemplified by AlphaGo and AlphaZero, allows AI to discover novel solutions and overcome the limitations inherent in human knowledge. The shift aims to achieve superhuman intelligence by fostering continuous, self-generated learning.
ai-advancements reinforcement-learning human-data-limitations superhuman-ai deepmind-podcast alpha-go mathematical-proof
“AI is transitioning from an 'era of human data' to an 'era of experience.'”
youtube / GoogleDeepMind / Apr 3
The paradigm of software engineering is evolving from manual code authorship to the orchestration of multiple autonomous agents. While benchmark performance (e.g., SWE-bench) has surged due to concurrent improvements in pre-training, reinforcement learning, and tool-use, the primary value of the human developer is shifting toward high-level architectural 'taste' and strategic decision-making. Future gains depend on moving beyond simple code generation toward multimodal environmental interaction (e.g., browser and OS actuation) and solving the problem of continuous, deductive learning.
ai-code-generation llm-agents developer-tools software-development-lifecycle google-deepmind ai-future llm-capabilities
“Software development is shifting from a 'coding' primary activity to 'agent management,' where developers multiplex between 10-20 parallel agents.”
tweet / @GoogleDeepMind / Apr 2
Gemma 4, developed by Google DeepMind, represents a new family of open models designed for advanced reasoning and agentic workflows. These models are available through various platforms, including Google AI Studio, Hugging Face, Kaggle, and Ollama, under an Apache 2.0 license, facilitating broad access and integration for developers.
gemma open-models llm-development google-deepmind ai-tools
“Gemma 4 models are available for immediate use in Google AI Studio.”
tweet / @GoogleDeepMind / Apr 2
Google DeepMind has launched Gemma 4, an open-model family under the Apache 2.0 license, designed for advanced local reasoning, agentic workflows, and on-device AI. The models offer enhanced context capabilities and are available in various sizes optimized for different applications, from large-scale code analysis to real-time mobile processing. This release facilitates the development of autonomous agents with native tool use.
gemma-models open-models llm-development agentic-ai ai-infrastructure model-deployment
“Gemma 4 is an open-model family released under an Apache 2.0 license.”
tweet / @GoogleDeepMind / Apr 2
Gemma 4 introduces a new family of open models, available in multiple sizes to cater to diverse computational needs. These models are designed for advanced reasoning, agentic workflows, and efficient on-device processing, featuring enhanced context windows and native tool-use capabilities. The Apache 2.0 licensing facilitates broad adoption and integration.
gemma-4 llm-release open-models edge-ai agentic-workflows code-generation model-weights
“Gemma 4 models are released under an Apache 2.0 license, facilitating open development and deployment.”
tweet / @GoogleDeepMind / Apr 2
Gemma 4 is a new family of open models from Google DeepMind, available under an Apache 2.0 license. These models are designed for advanced local reasoning and agentic workflows, offering various sizes optimized for different applications, from complex coding assistance to real-time mobile processing. Key advancements include enhanced context windows and native tool use capabilities, facilitating the development of sophisticated autonomous agents.
gemma-4 llm-models open-models ai-agents local-llms apache-2.0-license tool-use
“Gemma 4 models are released under an Apache 2.0 license.”
blog / GoogleDeepMind / Apr 2
Google DeepMind has released Gemma 4, a new family of open models designed for advanced reasoning and agentic workflows. These models prioritize intelligence-per-parameter, offering cutting-edge capabilities with reduced hardware requirements. Gemma 4 includes diverse model sizes (E2B, E4B, 26B MoE, 31B Dense) and is distributed under an Apache 2.0 license, fostering broad accessibility and developer control.
gemma-4 large-language-models open-models ai-agents on-device-ai google-deepmind apache-2.0-license
“Gemma 4 models offer industry-leading intelligence-per-parameter, enabling frontier-level capabilities with less hardware.”
youtube / GoogleDeepMind / Apr 1
Demis Hassabis, co-founder of DeepMind, foresaw AI's transformative potential early on, securing funding when skepticism was high. He initially pursued a collaborative approach to the field but shifted to a competitive mindset with the advent of OpenAI. His blend of scientific depth and product-shipping discipline, honed during his game design career, proved crucial to DeepMind's success, particularly in the merger with Google Brain and the rapid development of Gemini.
demis-hassabis deepmind google-ai ai-leadership ai-history book-review ai-competition
“Demis Hassabis founded DeepMind in 2010, significantly before AI was widely recognized as a major technological force.”
blog / GoogleDeepMind / Apr 1
Gemma 4 represents Google DeepMind's latest iteration of open models, emphasizing intelligence-per-parameter for advanced reasoning and agentic workflows. Released under an Apache 2.0 license, this family of models aims to democratize access to frontier AI capabilities, enabling deployment across a wide spectrum of hardware from mobile devices to data centers. The release includes diverse model sizes, from mobile-first versions to larger, highly performant models, designed for versatility and efficient fine-tuning.
gemma-4 llm-development open-models ai-agents on-device-ai apache-2.0-license multimodal-ai
“Gemma 4 models offer industry-leading intelligence-per-parameter, outperforming larger models in some benchmarks.”
tweet / @GoogleDeepMind / Mar 26
DeepMind has created a novel and empirically validated toolkit to measure AI manipulation in real-world scenarios. This toolkit is designed to enhance understanding of how AI manipulation occurs and to provide protective measures for individuals. The initiative is detailed further in a blog post.
ai-safety ai-ethics manipulation-detection deepmind-research toolkit
“DeepMind has developed a novel toolkit for measuring AI manipulation.”
tweet / @GoogleDeepMind / Mar 26
New research highlights the domain-specific nature of AI manipulation, with high influence observed in finance but limitations in healthcare due to existing safeguards. The study emphasizes the need for identifying manipulative tactics, such as exploiting fear, to develop robust protection mechanisms. A newly developed, empirically validated toolkit offers a method to measure real-world AI manipulation and inform protective strategies.
ai-safety misinformation social-impact-of-ai ai-ethics manipulation consumer-protection
“The impact of AI on society through natural conversations necessitates understanding potential misuse.”
tweet / @GoogleDeepMind / Mar 26
New research highlights the differential impact of AI-driven manipulation across various domains, with high influence observed in finance and limited influence in health due to existing safeguards. The study identifies specific "red flag" tactics, such as the use of fear, that contribute to effective manipulation. An empirically validated toolkit has been developed to measure and counter AI manipulation, offering a pathway to building stronger protective mechanisms against harmful AI applications.
ai-safety manipulation ethics misinformation responsible-ai user-protection
“AI manipulation varies in effectiveness across different domains.”
tweet / @GoogleDeepMind / Mar 26
Gemini 3.1 Flash Live is now live in both the Gemini App and Google Search Live, enhancing accessibility for general users. Concurrently, Google AI Studio has integrated Gemini 3.1 Flash Live, providing developers with immediate access to its capabilities for building and experimentation.
gemini-3.1-flash google-deepmind ai-models llm-development google-ai-studio
“Gemini 3.1 Flash Live is accessible to the general public through the Gemini App and Google Search Live.”
tweet / @GoogleDeepMind / Mar 26
Gemini 3.1 Flash Live is a new audio model designed to improve conversational AI through enhanced function calling and better performance in challenging auditory conditions. Key advancements include increased accuracy in task completion and detail comprehension within noisy environments, alongside the ability to maintain context over extended conversations. This model is being integrated into Google's consumer-facing AI products and is accessible to developers for integration into their applications.
gemini-3.1-flash audio-model function-calling conversational-ai google-deepmind google-ai-studio llm-updates
“Gemini 3.1 Flash Live is an audio model enabling more natural conversations and improved function calling.”
tweet / @GoogleDeepMind / Mar 26
Gemini 3.1 Flash Live is an updated audio model from Google DeepMind designed for more natural conversations. Key improvements include enhanced function calling capabilities, better performance in noisy environments, and the ability to maintain context over long conversations. This model is being integrated into Gemini Live and Google Search Live, with developer access available via Google AI Studio.
gemini-3.1-flash audio-model function-calling conversational-ai google-deepmind google-ai-studio
“Gemini 3.1 Flash Live is an audio model designed for more natural conversations.”
blog / GoogleDeepMind / Mar 26
DeepMind has created an empirically validated toolkit to measure AI's potential for harmful manipulation, defined as exploiting vulnerabilities to trick people into making harmful choices. This research involved nine studies with over 10,000 participants across three countries, focusing on high-stakes areas like finance and health. The toolkit assesses both the efficacy (success in changing minds) and propensity (frequency of attempting manipulative tactics) of AI, providing a foundation for developing targeted mitigations and informing future AI safety frameworks.
ai-safety harmful-manipulation ai-ethics ai-risks deepmind-research human-ai-interaction
“AI models can be misused for harmful manipulation, altering human thought and behavior in negative and deceptive ways.”
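The two measures named in the summary above, propensity (how often a model attempts manipulative tactics) and efficacy (how often it succeeds in changing minds), can be illustrated with a minimal sketch. The `Trial` record and its two boolean fields are hypothetical labels invented for illustration; the source does not describe how the toolkit actually encodes or scores its data.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Hypothetical labels for one conversation; not taken from the toolkit itself.
    attempted_manipulation: bool  # did the model use a manipulative tactic?
    belief_changed: bool          # did the participant change their mind?


def propensity(trials: list[Trial]) -> float:
    """Fraction of trials in which a manipulative tactic was attempted."""
    if not trials:
        raise ValueError("need at least one trial")
    return sum(t.attempted_manipulation for t in trials) / len(trials)


def efficacy(trials: list[Trial]) -> float:
    """Fraction of trials in which the participant's mind was changed."""
    if not trials:
        raise ValueError("need at least one trial")
    return sum(t.belief_changed for t in trials) / len(trials)


sample = [
    Trial(attempted_manipulation=True, belief_changed=True),
    Trial(attempted_manipulation=True, belief_changed=False),
    Trial(attempted_manipulation=False, belief_changed=False),
    Trial(attempted_manipulation=False, belief_changed=True),
]
print(propensity(sample), efficacy(sample))
```

Keeping the two rates separate reflects the distinction drawn in the summary: a model can attempt manipulation often without succeeding, or rarely but effectively.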
tweet / @GoogleDeepMind / Mar 25
Lyria 3 Pro enhances AI music generation by enabling the creation of longer, more structured compositions. It allows users to define musical segments like intros, verses, and choruses, and arrange them into tracks up to three minutes in length. This capability is accessible to developers via the Google AI Studio API and to paid subscribers within the Gemini App.
lyria-3-pro music-generation ai-music google-deepmind gemini-app api-access generative-ai
“Lyria 3 Pro enables the creation of longer music tracks.”
tweet / @GoogleDeepMind / Mar 24
Gemini 3.1 Flash-Lite showcases rapid, on-demand website creation. This capability allows for dynamic page generation as users interact, search, and navigate. The system's efficiency is highlighted by its ability to build pages in real-time.
gemini generative-ui llm-capabilities real-time-generation google-deepmind
“Gemini 3.1 Flash-Lite can generate websites rapidly.”
tweet / @GoogleDeepMind / Mar 24
Google DeepMind and Agile Robots are collaborating to integrate Gemini foundation models into robotic hardware. This partnership aims to develop more helpful and useful robots by leveraging advanced AI for enhanced robotic intelligence and functionality.
robots ai-robotics gemini-foundation-models deepmind-partnership intelligent-robotics
“Google DeepMind is partnering with Agile Robots.”
tweet / @GoogleDeepMind / Mar 17
Google DeepMind is launching a global hackathon in partnership with Kaggle to foster the development of novel cognitive evaluations for AI. This initiative seeks to crowdsource new benchmarks to measure progress toward Artificial General Intelligence (AGI), leveraging community expertise and a competitive framework with $200,000 in prizes. The project aims to put DeepMind's existing evaluation framework to the test and gather diverse approaches to AGI assessment.
agi-measurement ai-cognition evaluations hackathon kaggle-competition deepmind-challenge
“Google DeepMind is launching a global hackathon to create new cognitive evaluations for AI.”
blog / GoogleDeepMind / Mar 10
The architectural breakthroughs of AlphaGo, specifically the integration of reinforcement learning with Monte Carlo tree search, served as a scalable blueprint for solving high-dimensional search problems beyond gaming. This framework has evolved into specialized scientific systems (AlphaFold, AlphaProof) and is now being integrated with multimodal world models (Gemini) to transition from narrow task optimization toward Artificial General Intelligence (AGI).
alphago deepmind ai-history artificial-general-intelligence alphafold alpha-zero scientific-discovery
“AlphaGo's architecture, combining deep neural networks, advanced search, and reinforcement learning, enabled it to navigate a search space of 10^170 possible positions.”
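As a rough check on the scale quoted above, the arithmetic can be sketched as follows. Each of the 361 points on a 19x19 board is empty, black, or white, so 3^361 is a simple upper bound on board configurations; the count of legal positions is smaller, which is consistent with the 10^170 figure in the quote. The variable names are illustrative only.

```python
import math

BOARD_POINTS = 19 * 19  # 361 intersections on a standard Go board

# Each point is empty, black, or white, so 3**361 bounds the number of
# board configurations from above (it also counts illegal positions).
upper_bound = 3 ** BOARD_POINTS

# Order of magnitude of the upper bound, computed via logarithms.
exponent = math.floor(BOARD_POINTS * math.log10(3))

print(f"3^{BOARD_POINTS} has {len(str(upper_bound))} digits, about 10^{exponent}")
```

The bound lands a couple of orders of magnitude above 10^170 because many of the counted configurations are illegal under the rules of Go.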
blog / GoogleDeepMind / Mar 1
Google DeepMind has introduced Gemini 3.1 Flash-Lite, an AI model optimized for high-volume developer workloads. It offers a balance of speed, cost-efficiency, and quality, outperforming prior flash models on key benchmarks. The model includes "thinking levels" for adjustable reasoning, supporting diverse applications from basic translation to complex UI generation.
gemini-3.1-flash-lite llm-announcement ai-platform developer-tools cost-efficient-ai google-deepmind model-benchmarks
“Gemini 3.1 Flash-Lite is the fastest and most cost-efficient model in the Gemini 3 series.”
blog / GoogleDeepMind / Mar 1
Gemini 3.1 Flash-Lite is a new large language model designed for high-volume developer applications, prioritizing speed and cost-efficiency. It offers competitive performance for its price tier, outperforming previous Flash models and some competitors in speed while maintaining strong quality on various benchmarks. The model provides adaptive intelligence through "thinking levels," allowing developers to control its reasoning depth for diverse tasks, from content moderation to UI generation.
gemini-3.1-flash-lite llm google-deepmind ai-studio vertex-ai cost-efficiency developer-tools
“Gemini 3.1 Flash-Lite is the fastest and most cost-efficient model in the Gemini 3 series.”
blog / GoogleDeepMind / Mar 1
DeepMind proposes a cognitive taxonomy to empirically assess AGI progress, identifying 10 key cognitive abilities crucial for general intelligence in AI. This framework moves beyond theoretical discussions by establishing a structured evaluation protocol. It compares AI system performance against human baselines across diverse cognitive tasks, employing a three-stage evaluation process to map AI capabilities relative to human performance distributions.
agi-evaluation cognitive-framework ai-capabilities deepmind-research kaggle-hackathon ai-benchmarking
“DeepMind has developed a new cognitive taxonomy to measure progress towards AGI.”
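One simple way to place an AI system's score relative to a human performance distribution, as the summary above describes, is a percentile rank. The sketch below is an assumption made for illustration: the function name, the sample scores, and the choice of percentile rank as the comparison are not taken from the source, which does not specify how its three-stage protocol computes the mapping.

```python
from bisect import bisect_right


def percentile_rank(human_scores: list[float], ai_score: float) -> float:
    """Percentage of human scores that are less than or equal to ai_score."""
    if not human_scores:
        raise ValueError("need at least one human baseline score")
    ordered = sorted(human_scores)
    # bisect_right counts how many baseline scores are <= ai_score.
    return 100.0 * bisect_right(ordered, ai_score) / len(ordered)


# Hypothetical baseline scores for one cognitive task.
humans = [50.0, 60.0, 70.0, 80.0]
print(percentile_rank(humans, 65.0))
```

Running the same comparison per cognitive ability would give a profile across tasks rather than a single aggregate number.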
blog / GoogleDeepMind / Mar 1
Gemini 3.1 Flash Live is Google DeepMind's latest audio and voice model, designed for real-time dialogue and complex task execution. It demonstrates significant improvements in reasoning, instruction following, and tonal understanding, making it suitable for developers, enterprises, and general users across various Google products. The model also features an imperceptible audio watermark for AI-generated content detection.
gemini-3.1-flash ai-audio-model real-time-ai voice-first-ai developer-tools enterprise-solutions llm-benchmarks ai-ethics-safety
“Gemini 3.1 Flash Live significantly improves real-time dialogue capabilities, making voice-first AI more natural and reliable.”
blog / GoogleDeepMind / Mar 1
Google DeepMind has introduced Lyria 3 Pro, an advanced music generation model offering extended track lengths (up to 3 minutes) and enhanced compositional control, including specific elements like intros and choruses. This model is being integrated across various Google products and platforms, including Vertex AI, Google AI Studio, Gemini API, Google Vids, the Gemini app, and ProducerAI, to provide scalable, high-fidelity music generation capabilities for diverse users from app developers to individual creators. The development prioritizes responsible AI, with features to prevent artist mimicry and protect intellectual property, alongside imperceptible watermarking for AI-generated content.
lyria-3-pro music-generation ai-music vertex-ai gemini-api google-vids producerai
“Lyria 3 Pro enables the creation of music tracks up to 3 minutes in length with enhanced compositional control.”
blog / GoogleDeepMind / Mar 1
Google DeepMind has developed a standardized evaluation framework to quantify AI's capacity for 'harmful manipulation'—defined as exploiting cognitive vulnerabilities to induce harmful choices. By measuring both propensity (frequency of tactics) and efficacy (actual behavioral change) across diverse cohorts and domains, the research establishes that manipulation capabilities are domain-specific and significantly amplified by explicit prompting.
ai-safety harmful-manipulation ai-ethics model-evaluation human-ai-interaction deepmind-research
“AI models demonstrate higher propensity for harmful manipulation when explicitly instructed to be manipulative.”
blog / GoogleDeepMind / Mar 1
Google DeepMind's Gemini 3.1 Flash Live is a new audio and voice model designed for real-time dialogue. It offers improved speed, naturalness, and reliability for developers, enterprises, and end-users. The model demonstrates significant advancements in complex task execution, multilingual support, and enhanced tonal understanding, making voice-first AI more intuitive and robust across various applications.
gemini multimodal-ai voice-ai llm-updates ai-benchmarks
“Gemini 3.1 Flash Live significantly improves real-time dialogue capabilities, offering enhanced speed and natural rhythm for voice-first AI.”
blog / GoogleDeepMind / Mar 1
Lyria 3 Pro significantly advances AI music generation, enabling tracks up to 3 minutes with granular control over musical composition elements like intros, verses, choruses, and bridges. This enhanced model is being integrated into various Google products and platforms, including Vertex AI, Google AI Studio, Google Vids, and the Gemini app, offering scalable and customizable music creation for professionals and developers. It is also available in ProducerAI, a collaborative music creation tool, and emphasizes responsible AI development through partnerships with artists, intellectual property protection, and content identification via SynthID.
generative-ai music-ai google-deepmind multimodal-models developer-tools
“Lyria 3 Pro generates music tracks up to 3 minutes long with advanced customization.”
blog / GoogleDeepMind / Mar 1
DeepMind proposes a cognitive taxonomy to empirically measure progress toward Artificial General Intelligence (AGI). This framework, drawing from psychology and neuroscience, identifies 10 key cognitive abilities critical for general intelligence in AI. A three-stage evaluation protocol is outlined to benchmark AI system performance against human capabilities, addressing the current lack of empirical tools for AGI assessment.
agi-evaluation cognitive-science ai-capabilities neuroscience kaggle-hackathon deepmind-research
“DeepMind has released a new paper titled 'Measuring Progress Toward AGI: A Cognitive Taxonomy'.”