Bitter Lesson
“And of course I I reference the bitter lesson all the time as one of the deep fundamental insights to the kind of broader effort of everything that goes on here”
What the smart people are recommending. 7865 books, tools, and products endorsed by the thinkers absorb.md tracks. Ranked by how many times each has been recommended across compiled podcasts, papers, posts, and tweets.
“Decoupled DiLoCo explores how to continuously train AI models without ever stopping due to failures. https://t.co/1H6DyKrB_k42SBaM”
“To bridge this gap, we propose Language-Agnostic Utility-driven Reranker Alignment (LAURA), which aligns multilingual evidence ranking with downstream generative utility.”
“This paper presents F²LP-AP (Fast and Flexible Label Propagation with Adaptive Propagation Kernel), a training-free, computationally efficient framework...”
“Read our research: https://research.perplexity.ai/articles/advancing-search-augmented-language-models”
“Dive into the technical details → https://deepmind.google/blog/decoupled-diloco/?utm_source=x&utm_medium=social&utm_campaign=&utm_content=”
“I endorse everything in this manifesto.”
“Gift link: Yale Has Come Up With a Surefire Way to Make a Terrible Situation Worse https://t.co/LN1OWwM5hX”
“In this work, we present SCALA (Signaling CA with Local Attraction), a novel non-hierarchical cellular automaton decoder for quantum repetition and toric codes.”
“By evaluating SCALA alongside the hierarchical CA decoder proposed by Harrington, we provide a direct comparison between non-hierarchical and renormalization-group-style local decoding strategies.”
“we propose StructMem, a structure-enriched hierarchical memory framework that preserves event-level bindings and induces cross-event connections. By temporally anchoring dual perspectives and performi…”
“this work introduces A-THENA, a lightweight early intrusion detection system (EIDS) that significantly extends preliminary findings on time-aware encodings.”
“To support reproducibility and further research, we will publicly release our evaluation benchmark, preference training dataset, and code at https://pegah-kh.github.io/projects/prompts-override-vision…”
“It has become clear that Hartigan's algorithm (1975) gives better results in almost all cases”
“Telgarsky-Vattani note a typical improvement of 5% -- 10%.”
“In this paper, we present a transferable learning approach for PINNs premised on a fast Pseudoinverse PINN framework (Pi-PINN).”
“Our experiments demonstrate that SAE-SPLADE achieves retrieval performance comparable to SPLADE on both in-domain and out-of-domain tasks while offering improved efficiency.”
“To address this, we introduce OptiVerse, a comprehensive benchmark... OptiVerse will serve as a foundational platform for advancing LLMs in solving complex optimization challenges.”
“The code and models will be made publicly available.”
“Project website, videos and code: https://scout-comm.github.io/”
“the density matrix prescription by Brody and Graefe [Phys. Rev. Lett. 109, 230405 (2012)]”
“In this paper, we present Cornetto, the first benchmark to evaluate LLM-driven network configuration repair functionally and at scale.”
“To read our write-up in full, see here: https://www.anthropic.com/features/project-deal”
“More on CursorBench: https://cursor.com/blog/cursorbench”
“In a recent work [Phys. Rev. B 111, 214503 (2025)] we derived the quantum phase dynamics from a many-body treatment which leads to an effective gate voltage-dependent Hamiltonian”
“absolutely! here's how we do it: https://openai.com/index/unlocking-the-codex-harness/”
“We expand existing LM-based exploration (El-Naggar et al., 2025a,b) with a simple CL variant and find that CL substantially impacts the apparent inductive bias of LMs.”
“Everybody loves entropy, yet somehow we are always working hard to reduce it. http://quantumfrontiers.com/2026/04/12/how-i-learned-to-stop-worrying-andno-ive-always-adored-entropy/”
“Find out more → https://deepmind.google/blog/ai-co-clinician/”
“We demonstrate this by integrating GIST with two substantially different existing methods, LaDeCo and Design-o-meter.”
“I was quoted a couple times in this Atlantic article, but that isn’t (the only) reason I think it is good. It lays out the reasons why we whipsawed from “AI is a bubble” to “there are not enough data …”
“fuller writeup: https://www.latent.space/p/ainews-agents-for-everything-else”
“Companion paper arXiv:2603.20997 (Basu, 2026) defines the routing diagnostic task.”
“We propose CRAFT (Clustered Regression for Adaptive Filtering of Training data), a vectorization-agnostic selection method for training sequence-to-sequence models.”
“To resolve these challenges, we propose MADE-IT (Manifold-Aware Dynamic Expert Evolution and Implicit rouTing), an adaptive CMM method...”
“In this paper, we present pliable rejection sampling (PRS), a new approach to rejection sampling, where we learn the sampling proposal using a kernel estimator.”
“(Calandriello et al. 2016) propose INK-Estimate, an algorithm that processes the dataset incrementally and updates RLS, effective dimension, and Nystrom approximations on-the-fly.”
“In this paper we introduce SQUEAK, a new algorithm that builds on INK-Estimate but uses unnormalized RLS.”
“A comprehensive simulation study shows that the conformalized SL achieves valid finite-sample coverage with competitive performance relative to the true data-generating mechanism. A central contributi…”
“We introduce HiLight, an Evidence Emphasis framework that decouples evidence selection from reasoning for frozen LLM solvers. HiLight avoids compressing or rewriting the input, which can discard or di…”
“The increasing adoption of AI systems in hiring has raised concerns about algorithmic bias and accountability, prompting regulatory responses including the EU AI Act, NYC Local Law 144, and Colorado's…”
“Audio samples are available at https://qiangchunyu.github.io/UniSonate/.”
“In this paper, we formulate routing as a budget allocation problem and identify marginal gain... we propose RouteLMT (routing for LLM-based MT), an efficient in-model router... Extensive experiments d…”
“Project page: https://muzhancun.github.io/preprints/DROL.”
“We propose Hyperparameter-Divergent Ensemble Training (HDET), a method that repurposes these replicas for simultaneous learning rate exploration at negligible communication overhead.”
“We introduce SOB (The Structured Output Benchmark), a multi-source benchmark spanning three source modalities: native text, images, and audio conversations.”
“I've long objected to the 'surveillance advertising' trope, for it trivializes actual surveillance under force of government, such as this, brought to us by the folks at Palantir. Gift link.”
“We test whether the causal inner product of \citet{park2024linear} -- defined by the unembedding covariance $\Sigma$ -- enables cross-lingual concept transport.”
“Remarkable @DIEZEIT story: a researcher brings a last letter from a communist executed by the Nazis to the daughter he never knew.”
“resolve the open question of Gaillard, Gerchinovitz, Huard, and Stoltz, \emph{``Uniform regret bounds over $\mathbb{R}^d$ for the sequential linear regression problem with the square loss''} (ALT 2019…”
“We introduce Perturb-and-Correct (P&C), a post-hoc method for constructing epistemically diverse predictors from a single pretrained network.”
“We present Basis Selection with Importance (BSI), a principled low-rank compression framework that ranks and prunes bases by directly estimating the expected loss increase incurred when each basis is …”
“In this paper, we develop two metrics for critically examining this assumption: Causal Importance of Reasoning (CIR)... and Sufficiency of Reasoning (SR)...”
“we're introducing the Open Molecules 2025 data set and Meta's universal model for atoms...By making open molecules and universal model available, we're enabling researchers to drive innovation”
“We formalize this setting as reinforcement learning with rich feedback and introduce Self-Distillation Policy Optimization (SDPO)”
“I encourage you to explore our full blog post for more details and together let's push the boundaries of AI research to solve the big scientific questions about human and machine intelligence.”
“just last week the Transformers team wrote a really, really cool blog post about MoEs and how they work. So if anybody watching is interested, that's a great way to get into it.”
“which is used in arXiv:2511.09371 to establish a 6-functor formalism of complex analytic motivic homotopy theory and produce an analytification map that is compatible with the six operations”
“Vanguard uh comes out with a report every year they just came out for with their 2023 report which has the 2022 data and this is a this is something I follow as soon as it comes out I look it up”
“I think people should actually read the kind of management discussion section that are in these S1s... it can look quite dense, but it's worth reading.”
“you may have seen when you published an excerpt in The Atlantic, I just surfaced it for readers because I thought it was great.”
“And then there's that 17,000 word profile of Sam Altman by Ronan Farrow. Yeah, I mean, a fascinating piece of journalism for sure.”
“whose complexity is asymptotically quadratically faster than its classical counterpart in [Wang & Ling, IEEE Trans. Inf. Theory 2019]”
“we derive two versions of quantum dual attacks that improve upon the previous ones in [Pouly & Shen, EUROCRYPT 2024]”
“Our quantum Discrete Gaussian sampler can also be used to speed up the algorithm for solving the Short Integer Solution problem, in any norm, of [Bollauf, Pouly & Shen, ePrint 2026/225].”
“This motivated us to show how the Hamilton-Ostrogradski formalism can be applied it to the Pais-Uhlenbeck oscillator. We hope that the approach presented in this work can serve as a basis for discussi…”
“This release represents IceCube's most sensitive and comprehensive publicly available all-sky muon track dataset to date and should be preferred over previous releases.”
“Empirically, using only just three such steps improves sample quality over strong diagonal-covariance baselines, including OCM-DDPM, across standard image benchmarks.”
“146 of the 24,576 features in the layer-8 residual-stream SAE release of Bloom (2024) clear a Holm-corrected significance threshold”
“At a 16. Z had a great essay, on flock. We can really, really solve crime. And it's just a choice.”
“We also introduce a pixel-aligned RGB-thermal instruction-tuning dataset and Thermo-VL-Bench, a manually screened RGB-thermal VQA benchmark for low-light and cross-spectrum reasoning.”
“mckinsey put out a study showing investing in clean energy clean economy creates you know twice as many jobs as if you're investing in some of the heavy carbon oil and gas futures”
“we have a new report out today actually that tries to tackle exactly that question and it takes the data really seriously and it finds that there are really serious connections between economic growth…”
“A closed-loop meta-control layer (Extension A) reduces forgetting by an additional 81% at ~398M, mapping onto the System A and System M roles of Dupoux et al. (arXiv:2603.15381).”
“we evaluate Claude Code in an agentic proving framework on CLEVER, a Lean 4 benchmark for verifiable code generation”
“AbstentionBench is used as the evaluation benchmark, as this work aims to contribute to the field of abstention learning. All datasets on the benchmark were tested against this method and various base…”
“The question this paper addresses is: does this mathematical claim hold at the engineering level? To make the answer as general as possible, we deliberately choose the strictest engineering conditions”
“Training data comes from approximately 100 expert chain-of-thought (CoT) annotations produced by the BC Protocol (Zou & Xu, 2026b).”
“This is the structural reason why ~100 CoT examples are sufficient -- not a purely empirical observation like LIMA (Zhou et al., 2023).”
“We present SWE-chat, the first large-scale dataset of real coding agent sessions collected from open-source developers in the wild.”
“We release Llamion, a family of 14B-parameter open-weight language models...We release three checkpoints (Base, Chat, LongChat) that load with trust_remote_code=False in the Hugging Face Transformers …”
“The New York Times story: https://www.nytimes.com/2024/12/12/technology/ev-williams-twitter-medium-mozi.html?smid=nytcore-ios-share&referringSource=articleShare”