absorb.md — A knowledge graph of what AI thinkers are actually saying

paper / charleneli / 5d ago

TideGS Breaks the GPU Memory Barrier for 3D Gaussian Splatting at Billion-Primitive Scale

3D Gaussian Splatting (3DGS) training has been constrained to tens of millions of primitives on single-GPU hardware due to the memory footprint of per-Gaussian attribute vectors. TideGS exploits the inherent sparsity of 3DGS training — only camera-visible Gaussians are active per iteration — to treat GPU memory as a working-set cache backed by an SSD-CPU-GPU hierarchy. Three co-designed techniques (block-virtualized geometry, hierarchical async I/O pipelining, and trajectory-adaptive differential streaming) enable training over one billion Gaussians on a single 24 GB GPU, surpassing prior out-of-core baselines (~100M) and standard in-memory approaches (~11M) while achieving superior reconstruction quality on large-scale scenes.

3d-gaussian-splatting out-of-core-optimization large-scale-training computer-vision memory-management neural-rendering gpu-computing

“TideGS enables training of over one billion 3D Gaussian primitives on a single 24 GB GPU.”
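
The working-set idea in the TideGS summary above can be illustrated with a toy cache. This is a minimal sketch, not the paper's implementation: the `WorkingSetCache` class, the block IDs, the capacity of three blocks, and the least-recently-used eviction policy are all assumptions made for illustration, and a plain dictionary stands in for the CPU/SSD tiers.

```python
from collections import OrderedDict


class WorkingSetCache:
    """Toy LRU cache standing in for GPU memory (hypothetical, for illustration).

    `backing` is a plain dict standing in for the slower CPU/SSD tiers.
    """

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing
        self.resident = OrderedDict()  # block_id -> block data, oldest first
        self.loads = 0                 # number of transfers from the backing store

    def fetch(self, block_id):
        # Hit: mark the block as most recently used and return it.
        if block_id in self.resident:
            self.resident.move_to_end(block_id)
            return self.resident[block_id]
        # Miss: evict the least recently used block if the cache is full,
        # writing it back so any updates are not lost.
        if len(self.resident) >= self.capacity:
            old_id, old_block = self.resident.popitem(last=False)
            self.backing[old_id] = old_block
        block = self.backing[block_id]
        self.resident[block_id] = block
        self.loads += 1
        return block


def demo():
    # Six blocks of Gaussians in slow storage, room for three on the "GPU".
    backing = {i: [float(i)] * 4 for i in range(6)}
    cache = WorkingSetCache(capacity=3, backing=backing)
    # Each inner list is the set of blocks visible from one camera;
    # consecutive cameras overlap, so most fetches are hits.
    for visible in ([0, 1, 2], [1, 2, 3], [2, 3, 4]):
        for block_id in visible:
            cache.fetch(block_id)
    return cache.loads, list(cache.resident)


if __name__ == "__main__":
    print(demo())
```

In this toy run, nine fetches trigger only five loads because neighboring views share blocks, which is the kind of sparsity the summary describes.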

paper / charleneli / 5d ago

MSAVBench: The First Benchmark Exposing Critical Gaps in Multi-Shot Audio-Video Generation

Multi-shot audio-video (MSAV) generation represents the frontier of video synthesis, but existing benchmarks lack the scope and rigor to evaluate it reliably. MSAVBench introduces a four-dimensional evaluation framework (video, audio, shot, reference) covering up to 15 shots and non-realistic scenarios, paired with an adaptive hybrid evaluation pipeline that achieves 91.5% Spearman rank correlation with human judgments. Systematic evaluation of 19 state-of-the-art models reveals that fine-grained audio-visual synchronization and director-level control remain unsolved, with modular/agentic pipelines showing the most promise for closing the open- vs. closed-source performance gap.

video-generation audio-video-synthesis ai-evals benchmarking multimodal-ai computer-vision generative-models

“MSAVBench is the first comprehensive benchmark specifically designed for multi-shot audio-video generation evaluation.”
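
For readers unfamiliar with the agreement metric cited in the MSAVBench summary above, Spearman rank correlation is the Pearson correlation computed on ranks. The sketch below is a generic pure-Python version, not the benchmark's evaluation pipeline; the function names and the example scores are made up for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    # Hypothetical automatic scores and human rankings for five videos.
    metric_scores = [0.2, 0.5, 0.9, 0.4, 0.7]
    human_ranks = [1, 3, 4, 2, 5]
    print(spearman(metric_scores, human_ranks))
```

A value near 1 means the automatic scores order the videos almost the same way human raters do, regardless of the scores' absolute scale.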

paper / charleneli / 10d ago / failed

Deep Mixture of Experts Network for Resource Optimization in Aerial-Terrestrial CF-mMIMO Systems under URLLC

paper / charleneli / 10d ago / failed

Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation

paper / charleneli / 21d ago / failed

EvoPoC: Automated Exploit Synthesis for DeFi Smart Contracts via Hierarchical Knowledge Graphs

paper / charleneli / 21d ago / failed

AGN STORM 2. XII. Ground-Based Optical Photometry and Lag Measurements of Mrk 817

paper / charleneli / 21d ago / failed

Interlayer Five-Spin Polaron in Superconducting Bilayer Nickelates

paper / charleneli / 22d ago / failed

Penalized Likelihood for Dyadic Network Formation Models with Degree Heterogeneity

youtube / charleneli / 25d ago / failed

The Agentic Revolution: A Productive and Destructive Debate on AI

paper / charleneli / 29d ago

Adaptive Re-splitting with Shared Neural Policy Accelerates Parallel Trajectory Optimization

ATRS integrates a shared Deep Reinforcement Learning policy into parallel ADMM-based trajectory optimization to dynamically re-split stagnating segments. Formulated as a Multi-Agent Shared-Policy MDP, it achieves size invariance and zero-shot generalization by relying on solver internal states, not environment geometry. A confidence-based mechanism ensures stability by re-splitting only the most problematic segment, yielding up to 26% fewer iterations and 19.1% less computation time in simulations, with real-world validation under 35 ms cycles.

trajectory-optimization admm deep-reinforcement-learning motion-planning robotics parallel-optimization multi-agent-rl

“Existing fixed-structure parallel ADMM decompositions cause optimization stagnation in constrained regions due to lagging subproblems.”

paper / charleneli / 29d ago

V1 Functions as Saccadic Motor Cortex, Information Bottleneck, and Feedback Supplier for Recognition

V1 constructs a bottom-up saliency map to guide exogenous saccades, functioning as a motor cortex for eye movements. It imposes a processing bottleneck by massively reducing visual information at its output to downstream areas. V1 supports recognition in these areas via top-down feedback queries, primarily targeting central visual field representations, framing vision as selective looking through saccades and seeing via the bottleneck.

visual-cortex v1-functions saliency-map saccades visual-processing neural-bottleneck top-down-feedback

“V1 acts as a motor cortex for exogenously guiding saccades by constructing a bottom-up saliency map”

paper / charleneli / 29d ago

Inter-Stance: Pioneering Multimodal Dyadic Corpus Enables Interpersonal Stance and Affect Modeling

Inter-Stance introduces the first publicly available multimodal dataset of 45 dyads (90 participants) capturing synchronized 2D/3D face videos, thermal dynamics, voice/speech, physiology (PPG, EDA, heart rate, blood pressure, respiration), and self-reported affect during communicative interactions. It includes dyads with shared history and strangers, annotated for social signals, agreement, disagreement, and neutral stance, with potent emotion induction. The 20TB corpus supports novel modeling of dyadic multimodal behaviors, demonstrated via experiments on communication patterns and affect influenced by interpersonal history.

multimodal-corpus dyadic-interaction stance-analysis social-signals computer-vision affect-recognition

“No prior publicly-available dataset includes multimodal recordings and self-report measures of multiple persons in social interaction with dyadic recordings and annotations.”

paper / charleneli / Apr 26

UniGenDet Unifies Generative and Discriminative Paradigms for Co-Evolving Image Synthesis and Forgery Detection

UniGenDet introduces a unified framework that jointly optimizes image generation and generated image detection through symbiotic multimodal self-attention and detector-informed generative alignment. This co-evolutionary approach leverages adversarial synergy to bridge architectural gaps, enhancing generation fidelity via authenticity feedback and improving detection interpretability. Experiments across datasets confirm state-of-the-art results in both tasks.

image-generation generated-image-detection unified-framework co-evolutionary-learning computer-vision adversarial-training self-attention

“UniGenDet is a unified generative-discriminative framework for co-evolutionary image generation and generated image detection”

paper / charleneli / Apr 26

VistaBot Enables Calibration-Free View-Robust Robot Manipulation via Geometry-Aware Video Synthesis

VistaBot integrates feed-forward 4D geometry estimation, view synthesis latent extraction, and latent action learning to produce novel viewpoints from fixed-camera training data, enabling robust closed-loop manipulation under test-time viewpoint changes without calibration. It enhances action-chunking (ACT) and diffusion-based (π₀) policies, achieving 2.79× and 2.63× improvements in View Generalization Score (VGS) across simulation and real-world tasks. The framework also delivers high-quality novel view synthesis, with code and models to be released publicly.

robot-manipulation view-synthesis view-robustness diffusion-models 4d-geometry robotics-policies

“VistaBot achieves view-robust closed-loop manipulation without requiring camera calibration at test time”

paper / charleneli / Apr 26

Omni Model Enables Context Unrolling for Multimodal Reasoning Across Text, Image, Video, and 3D

Omni is a unified model trained natively on text, images, videos, 3D geometry, and hidden representations, inducing Context Unrolling where it reasons across multiple modal representations prior to prediction. This aggregates complementary information from heterogeneous modalities, approximating the shared multimodal knowledge manifold more faithfully. Consequently, Omni excels in multimodal generation, understanding, and advanced reasoning tasks like in-context generation of text, images, videos, and 3D geometry.

multimodal-models context-unrolling omni-model computer-vision multimodal-reasoning ai-research-paper

“Omni is natively trained on diverse modalities including text, images, videos, 3D geometry, and hidden representations”

tweet / @charleneli / Apr 20

Charlene Li Examines Agentic AI's Current Reality and Future Trajectory

Charlene Li's analysis demystifies agentic AI, detailing its actual current capabilities and limitations. It provides a realistic assessment of ongoing developments. The piece outlines probable next steps in agentic AI evolution for technical practitioners.

agentic-ai charlene-li ai-trends ai-future x-feed

“Agentic AI is currently experiencing specific developments beyond hype”

tweet / @charleneli / Apr 20 / failed

Data, Data Everywhere, but Not an Insight to Drink (Myths vs. Reality) https://twitter.com/i/broadcasts/1ynJOlemqEVxR

tweet / @charleneli / Apr 20 / failed

Are enterprise platforms about to face a mass exodus? https://twitter.com/i/broadcasts/1ynJOlXWVXyxR

tweet / @charleneli / Apr 20 / failed

The top questions boards should be asking about AI https://twitter.com/i/broadcasts/1OwxWXbeBDWKQ

tweet / @charleneli / Apr 18

Preparing Teams for AI-Driven Workflows

Charlene Li hosts a live broadcast on strategies to ready organizational teams for AI integration. The session targets practical preparation amid accelerating AI adoption. Technical leaders can access it via X Spaces for actionable insights on team readiness.

ai-future team-preparation charlene-li workforce-training ai-adoption leadership-strategy

“Charlene Li is conducting a live session on preparing teams for an AI future”

tweet / @charleneli / Apr 18

AI Culture Transformation Enters Disruptive Messy Middle Phase

Organizations adopting AI are navigating the "messy middle" stage of culture change, characterized by disruption and resistance following initial enthusiasm. This phase demands structured strategies to manage uncertainty and embed AI practices. Technical leaders must prioritize change management to transition beyond early hype toward sustainable integration.

ai-culture organizational-change charlene-li twitter-spaces ai-adoption

“AI adoption follows a 'messy middle' phase in organizational culture change”

tweet / @charleneli / Apr 18

AI Resistance Stems from Psychological Barriers, Addressed via Constraint-Led Leadership

Resistance to AI adoption arises from deep-seated psychological factors. Effective leadership counters this by imposing constraints that channel innovation. This approach transforms limitations into strategic advantages for AI integration.

ai-resistance psychology leadership constraints charlene-li twitter-spaces

“AI resistance is primarily driven by psychological factors”

tweet / @charleneli / Apr 18

AI Disruption Demands Six-Quarter Planning Over Annual Cycles

AI's rapid evolution requires organizations to replace static annual planning with a dynamic six-quarter walk for adaptive strategy. This approach enables quarterly pivots based on emerging AI capabilities and market shifts. Technical teams can use it to align roadmaps with accelerating innovation timelines.

ai-adaptation business-strategy annual-planning charlene-li six-quarter-walk leadership

“Traditional annual planning is insufficient for adapting to AI changes”

tweet / @charleneli / Apr 18

Charlene Li Launches Hourly Poll on AI Strategy Communication Tactics

Charlene Li is running an hourly poll via her X feed to gauge how organizations communicate their AI strategies. The poll links to a live broadcast session. This reflects ongoing interest in practical AI adoption messaging among tech leaders.

ai-strategy communication twitter-spaces charlene-li hourly-poll

“Charlene Li is conducting an hourly poll on her X feed”

tweet / @charleneli / Apr 18

Charlene Li Shares Two-Year Reflections on Global Book Success

Charlene Li is hosting a live broadcast to reflect on her experiences after her book has been available worldwide for over two years. The session captures author insights from sustained international publication. Technical audiences may find value in her discussion of long-term book distribution and reception strategies.

charlene-li twitter-feed author-reflections book-anniversary hourly-poll

“Charlene Li has a book that has been in the world for more than 2 years”

youtube / charleneli / Apr 14 / failed

Charlene Li on the Real Cost of Ignoring AI

youtube / charleneli / Apr 12 / failed

Ep 113: Leading with Business Strategy to Deliver Sustainable AI Value with Charlene Li

paper / charleneli / Apr 12

NUMINA: Improving Numerical Accuracy in Text-to-Video Diffusion Models

Text-to-video diffusion models frequently fail to generate the correct quantity of objects specified in prompts. NUMINA, a training-free framework, addresses this by identifying prompt-layout inconsistencies using attention heads to create a countable latent layout. It then refines this layout and modulates cross-attention to improve numerical alignment. This method significantly enhances counting accuracy and CLIP alignment while maintaining temporal consistency.

text-to-video diffusion-models numerical-alignment computer-vision ai-research

“Text-to-video diffusion models struggle with generating the accurate number of objects from text prompts.”

paper / charleneli / Apr 12

ETCH-X: A Robust and Expressive Human Body Fitting Method for Clothed 3D Scans

ETCH-X is a novel human body fitting method designed to improve both the expressiveness and robustness of fitting parametric body models like SMPL-X to 3D point clouds of clothed humans. It achieves this through a tightness-aware fitting paradigm that filters out clothing dynamics, utilizes implicit dense correspondences for fine-grained fitting, and leverages disentangled, scalable training on diverse composable datasets. This approach significantly enhances performance on both seen and unseen data, addressing limitations of prior methods that excelled in only one aspect.

3d-modeling computer-vision body-fitting clothed-humans smpl-x deep-learning geometric-deep-learning

“ETCH-X utilizes a 'tightness-aware fitting paradigm' to mitigate the challenges posed by clothing dynamics in human body fitting.”

youtube / charleneli / Apr 2 / failed

AI Success in 90 Days?

youtube / charleneli / Apr 2

From Productivity Tool to Strategic Force: The Framework for Corporate AI Integration

Effective corporate AI integration requires shifting from a 'tool-first' productivity mindset to a 'strategy-first' approach where AI supports existing business objectives. The goal for leaders is to achieve 'AI fluency'—integrating the technology into daily workflows to augment human judgment, empathy, and wisdom (the '20%') rather than simply automating bad processes. Success is measured not by the number of use cases, but by the ability to use AI to drive customer engagement and business reinvention while maintaining ethical guardrails via a trust pyramid.

ai-adoption corporate-strategy organizational-transformation ai-leadership storytelling-in-business future-of-work ai-implementation

“AI typically handles roughly 80% of standard output, while the remaining 20%—composed of authenticity, unique voice, and deep insight—represents the primary competitive advantage for humans.”

tweet / @charleneli / Jan 27

Charlene Li Reflects on Two Years Post-Book Launch with an X Feed Poll

Charlene Li engaged her audience through an hourly poll on her X feed, two years after her book's release. This initiative serves as a direct author reflection on the reception and longevity of her work, leveraging social media for real-time audience interaction and feedback.

author-reflections social-media-insights content-strategy personal-branding

“Charlene Li conducted an hourly poll on her X feed.”

tweet / @charleneli / Jun 3

AI Strategy Communication Is a Live Conversation, Not a Memo

Charlene Li, a prominent digital transformation analyst, posed a public poll asking how organizations are communicating their AI strategy, signaling that internal and external AI communication is an emerging leadership challenge. The post links to a live broadcast, suggesting the topic warrants real-time, interactive discussion rather than static guidance. The framing implies a gap between organizations having an AI strategy and effectively communicating it to stakeholders.

ai-strategy leadership communication executive-insights social-media

“Communicating an AI strategy is a distinct, non-trivial challenge separate from formulating one.”

tweet / @charleneli / May 27

Businesses need to adapt planning to AI pace

Traditional annual planning cycles are too slow for the rapid advancements in AI. The "Six-Quarter Walk" is introduced as a more agile planning methodology, enabling businesses to continuously adapt to technological shifts. This approach emphasizes shorter planning horizons and frequent reassessments to integrate AI

annual-planning ai-adaptation six-quarter-walk strategic-planning business-strategy

“Traditional annual planning is insufficient for the pace of AI innovation.”

tweet / @charleneli / May 13

Leading Through AI Resistance: The Psychology Behind Organizational Pushback

Charlene Li, a recognized leadership and digital disruption analyst, hosted a broadcast exploring the psychological dimensions of resistance to AI adoption and how leaders can navigate organizational constraints. The content suggests a framework for understanding why individuals and teams resist AI, and how effective leadership can reframe constraints as catalysts. This is consistent with Li's broader body of work on disruptive leadership and change management in the context of emerging technology.

ai-adoption leadership change-management ai-resistance organizational-psychology

“AI adoption faces meaningful psychological resistance, not just technical or structural barriers.”

tweet / @charleneli / Apr 29

AI Culture Change Presents "Messy Middle" Challenges

Charlene Li identifies a "messy middle" phase in AI culture change, implying a period of significant uncertainty and difficulty between initial adoption and full integration. Organizations navigating this phase likely face challenges in adapting processes, skills, and mindsets to effectively leverage AI. Overcoming this "messy middle" is crucial for successful AI transformation.

ai-culture organizational-change ai-adoption leadership digital-transformation

“AI culture change involves a 'messy middle' phase.”

tweet / @charleneli / Apr 22

Preparing Teams for an AI Future

This entry links to a broadcast by Charlene Li on preparing teams for an AI future. The broadcast content itself was not available, so no specific claims, evidence, or detailed synthesis could be extracted.

ai-future team-development workforce-planning leadership-development technological-impact organizational-change