absorb.md

AI at Meta

Chronological feed of everything captured from AI at Meta.

Meta's AI Infrastructure Bet: Liquid Cooling, Custom Silicon, and the End of Commodity Data Centers

Meta's VP of Infrastructure Dan Rabinovich outlines a fundamental shift in data center design driven by AI workloads — rack thermal density is scaling from ~30 kW to 500–700 kW, forcing a transition from air to full-facility liquid cooling. Meta's in-house AI accelerator program (MTIA) is not primarily cost-driven but aimed at co-designing hardware/software for high-value internal workloads like ads ranking and recommendation, where workload-specific optimization yields superior performance per total cost of ownership (TCO). At the semiconductor level, Dennard scaling is effectively dead, shifting the competitive frontier to advanced packaging (chiplets, CoWoS, silicon-on-wafer), which introduces new yield, toolchain, and manufacturing cycle-time challenges at scale.
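
The scale of that thermal jump is easier to see with a back-of-envelope heat-removal calculation; the temperature-rise figures in the sketch below are illustrative assumptions, not Meta numbers.

```python
# Back-of-envelope heat removal, Q = mdot * cp * dT, rearranged for flow.
# The delta-T values (15 K air, 10 K water) are illustrative assumptions.
CP_AIR, RHO_AIR = 1005.0, 1.2      # J/(kg*K), kg/m^3 at ~room conditions
CP_H2O, RHO_H2O = 4186.0, 1000.0   # J/(kg*K), kg/m^3

def airflow_m3s(q_watts: float, dt: float = 15.0) -> float:
    """Volumetric airflow needed to absorb q_watts at a dt kelvin rise."""
    return q_watts / (CP_AIR * dt) / RHO_AIR

def waterflow_ls(q_watts: float, dt: float = 10.0) -> float:
    """Liters/second of water needed to absorb q_watts at a dt kelvin rise."""
    return q_watts / (CP_H2O * dt) / RHO_H2O * 1000.0

print(f"30 kW rack, air:    {airflow_m3s(30e3):5.1f} m^3/s")   # ~1.7, fan-friendly
print(f"700 kW rack, air:   {airflow_m3s(700e3):5.1f} m^3/s")  # ~38.7, impractical
print(f"700 kW rack, water: {waterflow_ls(700e3):5.1f} L/s")   # ~16.7, plumbable
```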

Meta's Custom Silicon for Video Transcoding: MSVP Scales Encoding Across Billions of Videos

Meta has developed MSVP (Meta Scalable Video Processor), a custom hardware accelerator purpose-built to handle the full video transcoding pipeline — decode, resize, and multi-format encode — at the scale demanded by Facebook, Instagram, and Messenger. MSVP outperforms traditional software encoders in throughput and quality, and is the first in the industry to embed objective quality metric computation directly in hardware, scoring every encode at scale. As generative AI, AR, and VR content creation accelerates, MSVP is positioned as a foundational infrastructure block for delivering that content to end users.
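
A software analogue of that pipeline, sketched here by driving the ffmpeg CLI from Python, gives a feel for what MSVP implements in fixed-function hardware; the file names, ladder rungs, and libx264/libvmaf choices are illustrative, and the quality scoring step stands in for the metric computation MSVP does on-chip.

```python
# Software analogue of MSVP's pipeline: decode -> resize -> multi-bitrate
# encode, then score every encode against the source. Requires an ffmpeg
# build with --enable-libvmaf; all names and rungs are illustrative.
import subprocess

SRC = "input.mp4"                                   # assumed 1920x1080 source
LADDER = [(1080, "4500k"), (720, "2500k"), (480, "1000k")]

for height, bitrate in LADDER:
    out = f"out_{height}p.mp4"
    # One rung: decode, scale to the target height, re-encode.
    subprocess.run(
        ["ffmpeg", "-y", "-i", SRC, "-vf", f"scale=-2:{height}",
         "-c:v", "libx264", "-b:v", bitrate, "-an", out],
        check=True,
    )
    # Score the rung against the source; VMAF compares at matched
    # resolution, so the rung is upscaled back to the source grid first.
    subprocess.run(
        ["ffmpeg", "-i", out, "-i", SRC,
         "-lavfi", "[0:v]scale=1920:1080[d];[d][1:v]libvmaf",
         "-f", "null", "-"],
        check=True,
    )
```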

Meta's MTIA: Why Custom Silicon Beats GPUs for AI at Hyperscale

Meta has developed MTIA (Meta Training and Inference Accelerator), a family of custom ASICs purpose-built for its internal AI and ML workloads, including ads ranking and recommendation systems. Unlike off-the-shelf GPUs, MTIA is co-designed with Meta's actual production workloads, enabling tighter silicon-software integration under the PyTorch stack and faster model deployment cycles. The vertical integration approach — spanning chip, board, rack, data center, firmware, compiler, and application runtimes — allows Meta to eliminate architectural waste, reduce power consumption, and maintain roadmap alignment with evolving AI workload demands.
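
MTIA's compiler stack is not public, but the general mechanism by which an accelerator slots under PyTorch can be sketched with the documented torch.compile custom-backend hook; the toy backend below merely inspects the captured graph and falls back to eager execution.

```python
# The documented torch.compile custom-backend hook that vendor accelerators
# plug into. This toy backend only prints the captured FX graph and falls
# back to eager execution; a real MTIA-style backend would lower the graph
# to device code instead.
import torch

def toy_backend(gm: torch.fx.GraphModule, example_inputs):
    print(gm.graph)        # inspect the ops Dynamo captured
    return gm.forward      # run the graph eagerly

model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU())
compiled = torch.compile(model, backend=toy_backend)
compiled(torch.randn(2, 16))   # first call triggers capture + backend
```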

Meta's Research SuperCluster: How Massive GPU Infrastructure Accelerates Frontier AI Training

Meta's Research SuperCluster (RSC) combines latest-generation compute, high-speed interconnects, and fast storage to dramatically compress AI training timelines. The system enables researchers to elastically scale workloads from 8 to 8,000 GPUs, turning multi-month training runs into days. RSC's practical impact is demonstrated by the No Language Left Behind (NLLB-200) project, where a 200-language translation model was trained in ~10 days rather than months. The infrastructure is positioned as a strategic lever for Meta to iterate faster and compete at the frontier of large-scale model development.
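
The 8-to-8,000-GPU elasticity comes down to writing training scripts against a launcher-agnostic distributed API; a minimal DistributedDataParallel skeleton (model and data are placeholders) shows the pattern, where only the torchrun flags change with scale.

```python
# Minimal DistributedDataParallel skeleton; model and data are placeholders.
# Launch: torchrun --nnodes=1 --nproc-per-node=8 train.py     (one node)
#         torchrun --nnodes=1000 --nproc-per-node=8 train.py  (same script, 8,000 GPUs)
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")          # torchrun supplies rank/world size
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = DDP(torch.nn.Linear(1024, 1024).cuda(local_rank),
                device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):                      # placeholder training loop
        x = torch.randn(32, 1024, device="cuda")
        loss = model(x).square().mean()
        opt.zero_grad()
        loss.backward()                      # gradients all-reduced across GPUs
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```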

Meta's Full-Stack AI Infrastructure Overhaul: Custom Silicon, Exascale Compute, and Next-Gen Data Centers

Meta has reoriented its entire infrastructure strategy around AI as the primary workload, moving from general-purpose compute to a vertically integrated stack spanning custom silicon (MTIA for inference, MSVP for video), purpose-built AI data centers with liquid cooling, a 16,000-GPU AI Research SuperCluster (RSC) delivering ~5 exaflops, and PyTorch 2.0's new graph-mode compiler stack. The company frames AI workloads as growing at 1,000x every two years, necessitating end-to-end co-design of chip, system, network, and software rather than incremental adaptation of existing web-scale infrastructure. Key differentiators include ownership of the full stack from silicon to kernel to framework, enabling optimizations impossible with off-the-shelf solutions. Open-sourcing foundational components (LLaMA weights, PyTorch 2.0) is treated as a strategic accelerant for ecosystem and talent, not a concession.
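
The graph-mode stack referenced here is exposed to users as a one-line opt-in: TorchDynamo captures the Python program into an FX graph and TorchInductor generates fused kernels, with eager code otherwise unchanged.

```python
# PyTorch 2.0 graph mode: TorchDynamo captures the model into an FX graph,
# TorchInductor compiles it into fused kernels; eager code is unchanged.
import torch

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.GELU())
compiled = torch.compile(model)      # one-line opt-in to the compiler stack
y = compiled(torch.randn(8, 64))     # first call triggers compilation
```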

Meta's AI-First Pivot: Consolidating Generative AI as the Core of Its Business and Platform Strategy

Meta has reorganized its generative AI efforts under a single org, signaling a strategic shift where AI is no longer a supporting function but the foundational layer across ads, Reels, Reality Labs, and its Family of Apps. Executives frame AI as the substrate for Meta's "next major computing platform," with generative capabilities — image generation, live speech translation, multimodal understanding — moving from research novelty to production deployment. The pace of progress is cited as a distinguishing factor: capabilities that were impossible a few years ago, such as image-to-text description via computer vision, are now baseline. Meta positions its internal talent concentration as a competitive moat for staying at the frontier.

Meta's Vertical AI Infrastructure Stack: Custom Silicon, Exascale Compute, and the End of General-Purpose Hardware

Meta is executing a full-stack AI infrastructure overhaul — from custom silicon to data center architecture — driven by AI workloads growing at 1,000x every two years. The company has developed two in-house chips (MTIA for ML inference/recommendation and MSVP for video encoding) to maximize performance-per-watt, bypassing GPU generality for domain-specific efficiency. Their AI Research SuperCluster (RSC), with 16,000 GPUs and ~5 exaflops of compute, represents one of the largest AI supercomputers operational today. The core thesis: at Meta's scale (serving ~half of humanity), off-the-shelf hardware is structurally insufficient, and vertical integration of silicon, software, and data center design is the only viable path.
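
Taken at face value, the 1,000x-per-two-years figure implies a striking doubling time; the conversion is a one-liner.

```python
import math

# 1,000x growth every 24 months implies a doubling time of
# 24 / log2(1000) months: demand doubles roughly every 10 weeks.
doubling_months = 24 / math.log2(1000)
print(f"{doubling_months:.2f} months")   # ~2.41
```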

Meta's SAM 3 Unifies Detection, Segmentation, and Tracking with Multi-Modal Prompting

Meta has released SAM 3 (Segment Anything Model 3), a unified model that extends the original SAM's click-based prompting with text and visual prompting capabilities, enabling detection, segmentation, and tracking across both images and videos. Text prompts allow every instance of an object category to be segmented at once, reducing manual effort. Visual prompting lets users select an object to surface similar ones in the same image, with iterative follow-up prompts for refinement. SAM 3 is already integrated into production Meta products, specifically powering new effects in Instagram's Edits app.
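
SAM 3's own API isn't described here, but the click-based prompting it extends is public in the original segment_anything package; the checkpoint path, image file, and click coordinates below are placeholders.

```python
# Click-based prompting from the original SAM, which SAM 3 extends with
# text and visual prompts. Checkpoint, image path, and the click
# coordinates are placeholders.
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)
predictor.set_image(np.array(Image.open("photo.jpg").convert("RGB")))

masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),   # one foreground click (x, y)
    point_labels=np.array([1]),            # 1 = positive point
    multimask_output=True,                 # return several candidate masks
)
```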

Meta's SAM 3D Brings Zero-Shot Image-to-3D Reconstruction with Human Body Specialization

Meta has introduced SAM 3D, a pair of models extending the Segment Anything Model into the 3D domain, enabling geometry and texture reconstruction for any object in a single image — including occluded or non-visible surfaces. A specialized variant focuses on human body reconstruction, generating accurate meshes of body shape and pose even for partially hidden individuals or those in uncommon poses. The system targets practical deployment across robotics, scientific research, and consumer platforms like Facebook Marketplace, and is accessible via the Segment Anything Playground.

Meta's SAM Audio: Multimodal Audio Isolation and Source Separation

SAM Audio is a state-of-the-art model for isolating specific sounds within complex audio mixes. It accepts text, visual, and span-based prompts to extract individual elements of speech, music, and environmental sound.