Modal Labs: Revolutionizing Serverless GPU Deployment for AI Inference
Modal Labs has built a platform that addresses the inefficiencies of traditional GPU deployments for AI inference. To handle variable demand and resource-allocation challenges, the platform combines three techniques: a buffered instance-management system that keeps pre-warmed workers ready, a lazy-loading file system that fetches container contents on demand, and GPU snapshotting. Together these drastically reduce cold-start times and improve GPU utilization, yielding a more cost-effective and responsive infrastructure for compute-intensive AI workloads.
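To make the buffered instance-management idea concrete, here is a minimal, hypothetical sketch of a warm-instance pool. The class name, `buffer_size` parameter, and `start_instance` factory are illustrative assumptions, not Modal's actual implementation: the point is only that requests draw from pre-started instances instead of paying a cold start, and the buffer is topped back up after each acquisition.

```python
import queue


class BufferedInstancePool:
    """Illustrative warm-instance buffer (a sketch, not Modal's real code).

    Requests are served from pre-started instances; a cold start happens
    only when the buffer is exhausted.
    """

    def __init__(self, buffer_size, start_instance):
        self.buffer_size = buffer_size
        self.start_instance = start_instance  # factory that boots a worker
        self.cold_starts = 0                  # how often we missed the buffer
        self.warm = queue.SimpleQueue()
        for _ in range(buffer_size):
            self.warm.put(self.start_instance())  # pre-warm the buffer

    def acquire(self):
        try:
            inst = self.warm.get_nowait()  # warm hit: near-zero startup latency
        except queue.Empty:
            self.cold_starts += 1
            inst = self.start_instance()   # buffer exhausted: pay a cold start
        # Refill so the buffer returns to its target depth for the next burst.
        # A real scheduler would refill asynchronously and size the buffer
        # from observed demand rather than a fixed constant.
        while self.warm.qsize() < self.buffer_size:
            self.warm.put(self.start_instance())
        return inst


# Usage: with the buffer refilled after every request, sequential traffic
# never observes a cold start.
pool = BufferedInstancePool(buffer_size=2, start_instance=object)
for _ in range(5):
    pool.acquire()
print(pool.cold_starts)  # → 0
```

The trade-off this sketch exposes is the same one the real system faces: a deeper buffer absorbs larger demand spikes but keeps more idle instances (and idle GPUs) provisioned.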