absorb.md

Garry Tan

Chronological feed of everything captured from Garry Tan.

AgentGA: A Novel Genetic Algorithm for Autonomous Code Generation

AgentGA introduces a new framework for evolving autonomous code-generation runs by optimizing the agent seed, which comprises the task prompt and optional parent archives. This system couples a population-level genetic algorithm with long-horizon agents, utilizing a deterministic 1:1 elite tournament for selection and an adaptively controlled operator allocation. The core innovation lies in searching over reusable starting conditions rather than directly modifying code, enabling inherited artifacts to improve subsequent autonomous runs.
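
One way to read the selection step is sketched below in Python. It assumes that a "1:1 elite tournament" pairs each parent seed with exactly one child and deterministically keeps the higher-scoring of the two. That reading, the `Seed` class, and the `fitness` callable are illustrative assumptions, not details confirmed by the summary.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Seed:
    """Hypothetical agent seed: a task prompt plus optional parent archives."""
    prompt: str
    parent_archives: Tuple[str, ...] = ()


def one_to_one_tournament(
    parents: List[Seed],
    children: List[Seed],
    fitness: Callable[[Seed], float],
) -> List[Seed]:
    """Pair parent i with child i and keep the better one.

    Assumed rule: the child replaces the parent only if it scores strictly
    higher, so ties keep the parent and the outcome involves no randomness.
    """
    if len(parents) != len(children):
        raise ValueError("each parent needs exactly one child")
    survivors = []
    for parent, child in zip(parents, children):
        survivors.append(child if fitness(child) > fitness(parent) else parent)
    return survivors
```

Because each slot is decided independently, the population size stays fixed and a strong parent is never displaced by a weaker child.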

Geo2Sound: Generating Acoustically Realistic Soundscapes from Satellite Imagery

Geo2Sound is a novel framework that addresses the challenge of generating realistic soundscapes from satellite imagery. It uniquely combines structural geospatial attribute modeling, semantic hypothesis expansion, and geo-acoustic alignment. This approach allows for the generation of acoustically plausible and geographically consistent soundscapes, outperforming existing baselines.

Switch: Hierarchical Multi-Skill System for Agile Humanoid Locomotion

The "Switch" system addresses limitations in humanoid robot skill transitions by introducing a hierarchical multi-skill framework. This framework utilizes a Skill Graph (SG) for kinematically similar transitions, a deep reinforcement learning-trained whole-body tracking policy, and an online skill scheduler. The scheduler enables real-time, robust execution and smooth transitions between diverse locomotion skills, enhancing safety and practical applicability of humanoid robots.

UniDoc-RL: Enhancing Visual RAG with Hierarchical Reinforcement Learning

UniDoc-RL is a novel reinforcement learning framework for visual Retrieval-Augmented Generation (RAG) that addresses the limitations of generic retrieval signals in existing systems. By formulating visual information acquisition as a sequential decision-making problem with a hierarchical action space, UniDoc-RL refines visual evidence from coarse-grained document retrieval to fine-grained image selection and active region cropping. The framework utilizes a dense multi-reward scheme and Group Relative Policy Optimization (GRPO) for effective end-to-end training without a separate value network, achieving significant performance gains on benchmarks.
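
The reason GRPO needs no separate value network can be shown with a small Python sketch: each sampled response is scored against the other responses in its own group, so the group mean acts as the baseline. The function name, the use of population standard deviation, and the `eps` term are assumptions made for illustration; how UniDoc-RL combines its multiple rewards into one scalar is not specified in the summary and is not modeled here.

```python
import statistics
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against its own sampling group.

    rewards: one scalar reward per response sampled for the same query.
    Returns (r - mean) / (std + eps) for every response in the group.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population std
    return [(r - mean) / (std + eps) for r in rewards]
```

Responses that beat the group average receive a positive advantage and the rest a negative one; if every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.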

Garry Tan Persists with Opus 4.6 via API Key in 2025

Garry Tan continues using Opus model version 4.6 accessed through an API key, as shared in an hourly poll on his X feed. This indicates preference for a specific older model iteration over newer alternatives. The setup relies on direct API integration without mention of platform-specific changes.

Garry Tan Clarifies Misleading "Hourly Poll" Framing of His X Feed

A user note labeled an hourly poll on Garry Tan's X feed, prompting an alarmed reaction ("Oh yikes"). Tan immediately calls for clarification to correct the potentially misleading or erroneous description. This indicates proactive error correction in real-time social media monitoring.

User-Driven Taste Customization in Agentic Note-Taking Systems

Garry Tan's SOUL.md tool lets users supply taste dynamically during agent interactions to personalize generated content. Users state their taste preferences in conversation with the agent. Results vary by individual (YMMV), since the personalization is subjective.

Thin Agent Harnesses Maximize Fat Skills and Code for Agentic Engineering

Agentic engineering works best when fuzzy, human-like operations are offloaded into expansive markdown-based skills and precise, deterministic tasks into robust code. The orchestration harness stays minimal to avoid bloat. This runs counter to the misconception that favors "fat harnesses"; the principle is "thin harness, fat skills and code."
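
A minimal Python sketch of that split follows. Everything in it is hypothetical: the `SKILLS` and `TOOLS` registries, the task names, and the `call_model` callable stand in for whatever a real setup uses. The point is only that the harness itself is a few lines of routing.

```python
from typing import Callable, Dict

# Fuzzy work lives in markdown skill text (hypothetical example skill).
SKILLS: Dict[str, str] = {
    "summarize": "# Summarize\nWrite a three-sentence summary of the input.",
}

# Precise, deterministic work lives in ordinary code (hypothetical example tool).
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}


def run(task: str, payload: str, call_model: Callable[[str], str]) -> str:
    """Thin harness: route to code when possible, otherwise to a skill."""
    if task in TOOLS:
        return TOOLS[task](payload)
    if task in SKILLS:
        # The skill's markdown becomes the instructions handed to the model.
        return call_model(f"{SKILLS[task]}\n\n---\n{payload}")
    raise KeyError(f"unknown task: {task}")
```

Adding a capability under this layout means adding a markdown skill or a function, not growing the harness.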

Garry Tan's X Feed Attracts Strong Hourly Engagement

Garry Tan's X feed prompts an hourly poll that receives an affirmative response. The user note indicates ongoing monitoring of his feed via hourly polls. The explicit "YES" suggests positive reception or validation in the poll results.

OpenClaw AI Agents Generate "Prompt Reports" for Collaborative Debugging

Garry Tan and collaborators share bug reports from their OpenClaw AI agents to troubleshoot issues in task execution. This mirrors GitHub's issue tracking but applies to AI prompts, hence the term "prompt reports." One user example shows an agent needing repeated reminders to adopt a tool such as gbrain after self-installation.

Garry Tan Builds AI Coding Tool in 9 Days Using His Own Hourly X Feed

Garry Tan has utilized his hourly-poll-generated X feed as a dataset for 3 months to develop gbrain, an open-source AI tool. He constructed the entire gbrain project from scratch in just 9 days. This demonstrates rapid prototyping of specialized LLMs leveraging personal social media archives.

Garry Tan Adopts Hotel Room Mascot as Personal Good Luck Charm

Garry Tan spotted a small figure in his hotel room and designated it as his good luck charm. This casual endorsement highlights a superstitious ritual in his otherwise tech-focused persona. The item remains in situ, serving as an impromptu talisman during his stay.

Garry Tan Signals Heavy Financial Investment in AI or X Ecosystem

Garry Tan reports currently allocating substantial funds ("a lot of dollars") into an unspecified high-value endeavor, shared via an hourly poll on his X feed. This indicates active capital deployment, likely into tech startups, AI, or platform enhancements given his Y Combinator leadership. The casual phrasing underscores ongoing, significant financial commitment without detailing recipients or purposes.

Private Group Chats in Claw Machines Enable Discreet Social Gaming

Moltbook introduces private group chats integrated into claw machine games, allowing users to connect socially without public visibility. Garry Tan endorses this feature as a strong innovation. The concept blends physical arcade gaming with private digital communication for friends.
