Chronological feed of everything captured from Alán Aspuru-Guzik.
paper / aspuruguzik / 29d ago
A hybrid quantum-classical workflow computes Auger electron spectra by combining the generative quantum eigensolver (GQE) for ground-state preparation, the quantum self-consistent equation-of-motion method for excitations, and the one-centre approximation for transition rates. GQE employs a GPT-2 model to generate optimized quantum circuits, enabling HPC parallelization and GPU acceleration for scalable performance. Demonstrated on the water molecule with the STO-3G basis, the workflow reproduces spectra from classical full CI, other computational methods, and experiment, while using half the gate count of VQE.
quantum-chemistry auger-spectroscopy generative-quantum-eigensolver quantum-eigensolver molecular-excitations quantum-simulation
“The hybrid quantum-classical workflow accurately calculates Auger spectra of water using STO-3G basis.”
paper / aspuruguzik / Feb 23
MōLe is an equivariant neural network that predicts coupled-cluster excitation amplitudes directly from Hartree-Fock molecular orbitals, bypassing DFT's accuracy limitations. Trained solely on small equilibrium geometries, it exhibits strong data efficiency and generalizes to larger molecules and off-equilibrium structures. It also accelerates CC convergence by reducing required iterations, enabling scalable high-accuracy wavefunction modeling.
molecular-orbital-learning coupled-cluster neural-wavefunctions quantum-chemistry machine-learning chemical-physics equivariant-models
“Coupled-cluster (CC) theory is the gold standard in quantum chemistry for accuracy beyond DFT.”
paper / aspuruguzik / Feb 19
El Agente Gráfico integrates LLM decision-making into a type-safe execution environment using dynamic knowledge graphs and object-graph mappers for structured state management via typed Python objects. This replaces unstructured text with symbolic identifiers, enhancing context consistency, provenance tracking, and tool orchestration. A single agent outperforms multi-agent systems on quantum chemistry benchmarks and extends to conformer generation and MOF design, proving scalable automation beyond prompt-centric approaches.
llm-agents scientific-automation knowledge-graphs quantum-chemistry type-safety execution-graphs ai-workflows
“Current LLM agentic approaches for scientific workflows rely on unstructured text, leading to fragile integration, obscured decision provenance, and poor auditability.”
paper / aspuruguzik / Feb 19
El Agente Sólido is a hierarchical multi-agent framework that uses LLMs to translate natural language objectives into end-to-end Quantum ESPRESSO workflows for solid-state simulations. It handles structure generation, input construction, execution, and analysis, integrating DFT, phonon calculations, and ML interatomic potentials. Benchmarking confirms reliable execution across diverse cases, boosting reproducibility in materials discovery.
llm-agents solid-state-simulations quantum-chemistry materials-discovery quantum-espresso density-functional-theory
“El Agente Sólido translates high-level natural language scientific objectives into complete computational pipelines for solid-state quantum chemistry.”
paper / aspuruguzik / Feb 4
El Agente Quntur is a hierarchical multi-agent AI system that acts as a research collaborator for quantum chemistry, automating ORCA 6.0 workflows via reasoning-driven decisions, composable actions, and guided deep research over documentation and literature. It eliminates hard-coded policies to enable adaptive planning, execution, and analysis of in silico experiments following best practices. The design principles generalize beyond ORCA to other quantum chemistry packages, addressing accessibility barriers for non-experts.
quantum-chemistry multi-agent-ai research-agent orca-software ai-collaborator computational-chemistry
“El Agente Quntur supports the full range of calculations available in ORCA 6.0”
paper / aspuruguzik / Feb 4
El Agente Estructural is a multimodal, natural-language-driven agent that manipulates molecular geometries in 3D using domain-informed tools and vision-language models, emulating human expert editing. It enables precise control over atomic replacements, connectivity, and stereochemistry without rebuilding core frameworks. Demonstrated in case studies including site-selective functionalization, ligand exchange, and image-guided structure generation from reaction schematics. Integrates into El Agente Quntur for enhanced autonomous quantum chemistry workflows.
molecular-modeling ai-agent multimodal-ai chemistry-ai structural-editing quantum-chemistry arxiv-paper
“El Agente Estructural enables precise control over atomic or functional group replacements, atomic connectivity, and stereochemistry without rebuilding core molecular frameworks.”
paper / aspuruguzik / Jan 27
ELECTRAFI models periodic charge densities in crystals using anisotropic Gaussians in real space, leveraging closed-form Fourier transforms and Poisson summation for analytic plane-wave coefficients. This enables full density reconstruction via single inverse FFT, bypassing grid probing, periodic summation, and spherical harmonics. It matches or exceeds SOTA accuracy on benchmarks while being up to 633x faster, and cuts DFT initialization costs by ~20% due to low inference time.
electrafi charge-density-prediction crystalline-materials gaussian-fourier periodic-systems ml-for-materials dft-acceleration
“ELECTRAFI reconstructs crystal charge densities in a fraction of a second”
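The Fourier-space idea in this entry can be sketched in one dimension with a single isotropic Gaussian (the entry describes anisotropic Gaussians in 3D crystals). Poisson summation gives the plane-wave coefficients of the periodized Gaussian in closed form, and one inverse transform then yields the density on the whole grid. All function names and parameter values here are illustrative assumptions, and the naive inverse DFT stands in for an inverse FFT, which evaluates the same sum faster.

```python
import cmath
import math

def signed_frequency(index, n):
    """Map an FFT-ordered index 0..n-1 to a signed integer frequency."""
    return index if index < n // 2 else index - n

def plane_wave_coeffs(mu, sigma, length, n):
    """Closed-form Fourier coefficients of a unit-mass Gaussian, periodized with period `length`."""
    coeffs = []
    for index in range(n):
        g = 2.0 * math.pi * signed_frequency(index, n) / length  # reciprocal-lattice vector
        coeffs.append(cmath.exp(-0.5 * (sigma * g) ** 2 - 1j * g * mu) / length)
    return coeffs

def inverse_dft(coeffs):
    """Evaluate the Fourier series on the grid x_j = j * length / n (naive O(n^2) sum)."""
    n = len(coeffs)
    density = []
    for j in range(n):
        total = 0j
        for index, c in enumerate(coeffs):
            total += c * cmath.exp(2j * math.pi * signed_frequency(index, n) * j / n)
        density.append(total.real)
    return density

def direct_periodic_sum(mu, sigma, length, n, images=5):
    """Reference: explicitly sum the Gaussian over neighbouring periodic images."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    density = []
    for j in range(n):
        x = j * length / n
        density.append(sum(
            norm * math.exp(-0.5 * ((x - mu - m * length) / sigma) ** 2)
            for m in range(-images, images + 1)
        ))
    return density
```

For a Gaussian that is wide compared with the grid spacing, the two routes agree to numerical precision, which is why no explicit summation over periodic images is needed once the coefficients are known.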
paper / aspuruguzik / Jan 22
Materealize integrates structure generation, property prediction, synthesizability assessment, and synthesis planning into a unified multi-agent framework for end-to-end inorganic materials design. It offers an instant mode for rapid task resolution, such as property-conditioned candidate design with recipes and data augmentation, completing in minutes via natural-language interface. Thinking mode employs multi-agent debate for refined outputs, including reasoning-driven routes, mechanistic hypotheses validated against literature and simulations.
multi-agent-systems materials-design materials-synthesis ai-for-science property-prediction synthesizability arxiv-paper
“Materealize orchestrates tools for structure generation, property prediction, synthesizability prediction, and synthesis planning in a single framework.”
paper / aspuruguzik / Jan 19
MATTERIX is a GPU-accelerated simulation framework that creates high-fidelity digital twins of chemistry laboratories, modeling robotic manipulation, powder/liquid dynamics, device functions, heat transfer, and basic reaction kinetics. It integrates realistic physics simulation, photorealistic rendering, and a modular semantics engine for logical states and continuous behaviors across abstraction levels. The framework supports open-source assets, hierarchical planning, and learning-based skills, enabling sim-to-real transfer and in silico testing of automated workflows to minimize physical experiments.
digital-twin robotics-simulation chemistry-automation lab-automation materials-discovery sim-to-real
“MATTERIX simulates robotic physical manipulation, powder and liquid dynamics, device functionalities, heat transfer, and basic chemical reaction kinetics.”
paper / aspuruguzik / Jan 19
TreeWriter models documents as hierarchical trees, enabling multi-level outlining, iterative editing, and context-aware AI suggestions that load relevant content dynamically. A within-subject study (N=12) comparing it to Google Docs + Gemini demonstrated superior performance in idea exploration/development, AI helpfulness, and authorial control for long-document tasks. A two-month field deployment (N=8) confirmed its efficacy for collaborative writing via structured organization.
ai-assisted-writing hierarchical-planning long-form-documents treewriter human-computer-interaction ai-co-writing user-study
“TreeWriter represents documents as trees to support multi-level outlines and AI-integrated editing.”
paper / aspuruguzik / Jan 15
Discrete Feynman-Kac Correctors provide a framework for controlling the sampling distribution of trained discrete masked diffusion models at inference using Sequential Monte Carlo algorithms. These algorithms enable temperature annealing, sampling from products of marginals across multiple diffusion processes, and reward-tilted generation without retraining. Applications demonstrate improved efficiency in Ising model sampling, code generation with language models, amortized learning, and high-reward protein sequence design.
discrete-diffusion-models feynman-kac-correctors sequential-monte-carlo annealing reward-guided-sampling machine-learning arxiv-paper
“Discrete Feynman-Kac Correctors control the generated distribution of discrete masked diffusion models at inference time via SMC algorithms”
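The weighting-and-resampling step behind temperature annealing can be sketched on a toy discrete distribution. This is a single importance-resampling step, not the paper's corrector for masked diffusion models: particles drawn from p are weighted by p^(β-1) and resampled so that they approximately follow p^β. The function name and the example values are assumptions for illustration.

```python
import random

def anneal_by_resampling(probs, beta, n_particles, rng):
    """Turn samples from `probs` into approximate samples from probs**beta (normalized)."""
    states = list(range(len(probs)))
    # Step 1: draw particles from the base distribution p.
    particles = rng.choices(states, weights=probs, k=n_particles)
    # Step 2: importance weight of each particle is target / proposal = p**(beta - 1).
    weights = [probs[s] ** (beta - 1.0) for s in particles]
    # Step 3: multinomial resampling in proportion to the weights.
    return rng.choices(particles, weights=weights, k=n_particles)
```

With `probs = [0.5, 0.3, 0.2]` and `beta = 2.0`, the resampled particles should concentrate on state 0 with frequency close to 0.25 / 0.38, without any retraining of the base sampler.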
paper / aspuruguzik / Dec 21
El Agente Cuántico is a multi-agent AI system that translates natural-language descriptions of scientific intent into executed, validated quantum simulations. It reasons over library documentation and APIs to dynamically assemble workflows across heterogeneous frameworks, covering state preparation, closed/open-system dynamics, tensor networks, quantum control, error correction, and resource estimation. This unifies disparate simulation paradigms under a single interface, reducing barriers and enabling scalable, autonomous exploration of quantum models.
quantum-simulation multi-agent-ai ai-automation quantum-computing natural-language-interface arxiv-paper
“Quantum simulation faces barriers from exponential Hilbert space growth and complex software tools”
paper / aspuruguzik / Dec 15
The method introduces a likelihood-free Bayesian optimization (BO) approach that skips explicit surrogate modeling and directly uses priors from general LLMs and chemistry foundation models to inform acquisition functions. It employs a tree-structured partition of the molecular search space with local acquisition functions, enabling efficient candidate selection through Monte Carlo Tree Search. Coarse-grained LLM-based clustering further boosts scalability by limiting evaluations to high-value clusters, yielding superior sample efficiency, robustness, and performance in low-data regimes.
bayesian-optimization foundation-models molecular-discovery llm-guided-bo monte-carlo-tree-search chemistry-ai
“Likelihood-free BO bypasses explicit surrogate modeling by directly leveraging priors from LLMs and chemistry foundation models to inform acquisition functions.”
paper / aspuruguzik / Nov 5
EGMOF employs a hybrid diffusion-transformer architecture that decouples inverse design into property-to-descriptor mapping via a 1D diffusion model (Prop2Desc) and descriptor-to-MOF structure generation via a transformer (Desc2MOF). This modularity minimizes retraining needs and sustains high performance with limited data, such as 1,000 samples. It achieves superior validity (95%) and hit rates (84%) on hydrogen uptake tasks, outperforming baselines by up to 57% and 14%, respectively, and generalizes across 29 diverse property datasets.
mof-generation diffusion-model transformer-architecture inverse-design materials-discovery data-efficient-ml
“EGMOF achieves over 95% validity and 84% hit rate on a hydrogen uptake dataset”
paper / aspuruguzik / Oct 14
SA-ICL introduces a framework that distills prior demonstration examples into lightweight, structured schemas—templates of key inferential steps and relationships—for explicit knowledge transfer in transformer-based LLMs. Drawing from cognitive schema theory, it augments reasoning on novel tasks, reducing dependence on demonstration volume. Experiments on GPQA chemistry/physics questions show up to 36.19% gains with single high-quality examples across LLMs, confirming they lack implicit schema formation but thrive with explicit scaffolding. This unifies ICL variants like pattern priming and CoT while enhancing interpretability.
in-context-learning schema-theory llm-reasoning cognitive-science gpqa-benchmark artificial-intelligence machine-learning
“LLMs lack capacity to form and utilize internal schema-based learning representations implicitly”
paper / aspuruguzik / Oct 8
Certain Lindbladian processes perform QPE-type tasks at standard quantum limit scaling, unlike Heisenberg-limit QPE. These dynamics permit quadratic fast-forwarding, simulated in O(√(t log(1/ε))) cost for time t and error ε, via a mechanism distinct from Hamiltonian fast-forwarding. This simulation doubles as a new Heisenberg-limit QPE algorithm and extends to efficient Gibbs state preparation with accelerated decoherence under Pauli noise.
quantum-phase-estimation lindbladians fast-forwarding quantum-simulation dissipative-dynamics gibbs-state-preparation quantum-algorithms
“Simple Lindbladian processes can be adapted to perform quantum phase estimation (QPE)-type tasks”
paper / aspuruguzik / Oct 8
RAISE is a robotic self-driving laboratory that automates liquid formulation mixing, droplet deposition, contact angle imaging, and measurement at a throughput of approximately one contact angle per minute. It integrates Bayesian optimization for iterative formulation exploration targeting user-defined wettability objectives. Multi-objective BO uses desirability scores to balance contact angle precision, surfactant minimization, and cost reduction, demonstrating robustness to surfactant purity variations.
self-driving-lab bayesian-optimization contact-angle surface-wettability robotic-automation formulation-discovery high-throughput-experiment
“RAISE achieves a contact angle measurement rate of approximately 1 per minute.”
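A desirability score of the kind mentioned in this entry maps each objective to [0, 1] and combines the results into a single value that an optimizer can maximize. The sketch below uses common Derringer-Suich-style shapes and a geometric mean; the exact functional forms and thresholds used by RAISE are not stated in this summary, so every function name and number here is an assumption.

```python
import math

def target_desirability(value, target, tolerance):
    """1.0 at the target, falling linearly to 0.0 at target +/- tolerance."""
    return max(0.0, 1.0 - abs(value - target) / tolerance)

def smaller_is_better(value, best, worst):
    """1.0 at or below `best`, 0.0 at or above `worst`, linear in between."""
    if value <= best:
        return 1.0
    if value >= worst:
        return 0.0
    return (worst - value) / (worst - best)

def overall_desirability(scores):
    """Geometric mean of the individual scores; any zero score vetoes the candidate."""
    if any(s <= 0.0 for s in scores):
        return 0.0
    return math.exp(sum(math.log(s) for s in scores) / len(scores))
```

A formulation could then be scored as `overall_desirability([target_desirability(angle, 60.0, 10.0), smaller_is_better(surfactant, 0.0, 2.0), smaller_is_better(cost, 1.0, 5.0)])`, where the targets and ranges are hypothetical. The geometric mean is a design choice that prevents a very good value on one objective from hiding a failure on another.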
paper / aspuruguzik / Sep 25
HIP predicts molecular Hessians directly from SE(3)-equivariant irreducible representations up to degree l=2 during graph neural network message passing, bypassing automatic differentiation or finite differences. This yields 1-2 orders of magnitude speedup, higher accuracy, lower memory use, easier training, and better scaling with system size compared to traditional methods. Validation across transition state search, geometry optimization, zero-point energy corrections, and vibrational analysis shows superior performance; code and models are open-sourced.
hessian-potentials graph-neural-networks computational-chemistry molecular-dynamics se3-equivariant machine-learning-physics
“Hessians can be constructed directly from SE(3)-equivariant, symmetric features using irreducible representations up to degree l=2 in graph neural networks”
paper / aspuruguzik / Sep 10
Recasting equivariant ML force fields as deep equilibrium models (DEQs) exploits temporal continuity in molecular dynamics, recycling intermediate features from prior timesteps. This yields 10-20% gains in accuracy and speed over baseline models on MD17, MD22, and OC20 200k datasets. DEQ training is more memory-efficient, enabling expressive models on larger systems.
deep-equilibrium-models machine-learning-force-fields equivariant-models molecular-dynamics neural-networks scientific-computing
“DEQ formulation of equivariant base model improves accuracy and speed by 10%-20% on MD17, MD22, and OC20 200k”
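The benefit of recycling features across timesteps can be seen in a scalar toy problem. A deep equilibrium layer is defined by a fixed point z = f(z, x); when the input x changes only slightly between consecutive MD steps, starting the solver from the previous fixed point needs fewer iterations than starting from zero. The update rule and names below are illustrative assumptions, not the paper's architecture.

```python
import math

def solve_fixed_point(x, z0, tol=1e-10, max_iter=1000):
    """Iterate z <- 0.5 * tanh(z) + x until successive iterates differ by less than tol.

    The map is a contraction (its slope is at most 0.5), so the iteration converges.
    Returns the fixed point and the number of iterations used.
    """
    z = z0
    for iteration in range(1, max_iter + 1):
        z_new = 0.5 * math.tanh(z) + x
        if abs(z_new - z) < tol:
            return z_new, iteration
        z = z_new
    raise RuntimeError("fixed-point iteration did not converge")

# Previous timestep: solve from scratch.
z_prev, _ = solve_fixed_point(1.00, 0.0)
# Next timestep (slightly different input): cold start versus warm start.
z_cold, iters_cold = solve_fixed_point(1.01, 0.0)
z_warm, iters_warm = solve_fixed_point(1.01, z_prev)
```

Both starts reach the same fixed point, but the warm start takes fewer iterations because it begins much closer to the solution.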
paper / aspuruguzik / Aug 26
A dynamical system of sticks and springs approximates continuous functions via piecewise-linear stick configurations, with spring potential energy encoding MSE loss minimized through dissipation. Applied to regression, it matches multi-layer perceptron performance. Free energy changes relate to learning data distributions, but environmental fluctuations impose a thermodynamic learning barrier limiting free energy reduction and thus learning capability.
physical-learning springs-sticks dynamical-systems thermodynamic-learning function-approximation ml-regression arxiv-paper
“The springs-and-sticks system can arbitrarily approximate any continuous function.”
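A minimal numerical analogue of this picture, under simplifying assumptions: the sticks form a piecewise-linear curve with joints at fixed x positions, each data point pulls on the curve through a vertical spring, and overdamped (purely dissipative) motion of the joint heights lowers the spring energy, which equals half the mean squared error. The fixed joint positions, the vertical springs, and all names are assumptions of this sketch rather than the paper's exact system.

```python
import math

def locate(knots_x, x):
    """Return (stick index, fractional position along that stick) for abscissa x."""
    k = 0
    while k < len(knots_x) - 2 and x > knots_x[k + 1]:
        k += 1
    w = (x - knots_x[k]) / (knots_x[k + 1] - knots_x[k])
    return k, w

def stick_height(knots_x, knots_y, x):
    """Height of the piecewise-linear stick chain at x."""
    k, w = locate(knots_x, x)
    return (1.0 - w) * knots_y[k] + w * knots_y[k + 1]

def spring_energy(knots_x, knots_y, data):
    """Mean potential energy of unit-stiffness springs: half the mean squared error."""
    return sum(0.5 * (stick_height(knots_x, knots_y, x) - t) ** 2 for x, t in data) / len(data)

def relax(knots_x, knots_y, data, step=0.5, n_steps=2000):
    """Overdamped dynamics: move each joint along the net spring force acting on it."""
    y = list(knots_y)
    for _ in range(n_steps):
        force = [0.0] * len(y)
        for x, t in data:
            k, w = locate(knots_x, x)
            stretch = (1.0 - w) * y[k] + w * y[k + 1] - t
            force[k] -= stretch * (1.0 - w) / len(data)
            force[k + 1] -= stretch * w / len(data)
        y = [yk + step * f for yk, f in zip(y, force)]
    return y

# Toy regression task: fit sin(pi * x) on [0, 1] with five sticks (six joints).
knots_x = [k / 5 for k in range(6)]
data = [(i / 49, math.sin(math.pi * i / 49)) for i in range(50)]
flat = [0.0] * len(knots_x)
fitted = relax(knots_x, flat, data)
```

After relaxation the spring energy drops well below its starting value, which is the regression fit expressed in mechanical terms.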
paper / aspuruguzik / Aug 15
Generative AI models including VAEs, diffusion models, and LLM-based agents leverage growing MOF datasets to propose novel porous reticular structures. These tools integrate with high-throughput screening and automated experiments in closed-loop pipelines, accelerating discovery for clean air and energy applications. Challenges persist in synthetic feasibility, dataset diversity, and domain knowledge integration.
generative-ai metal-organic-frameworks mof-design deep-learning materials-discovery reticular-chemistry ai-drug-discovery
“Generative AI enables autonomous proposal and laboratory synthesis of new MOF structures on demand”
paper / aspuruguzik / Jul 25
TreeReader decomposes academic papers into an interactive tree structure, with LLM-generated concise summaries for each section and on-demand access to underlying details. This addresses cognitive overload from linear formats like PDF/HTML and limitations of LLM chatbots, such as poor sectional nuance and lack of navigation. A user study confirms improved reading efficiency and comprehension through focused exploration and source verification.
tree-reader academic-reading hierarchical-summarization llm-augmented hci user-study arxiv-paper
“Traditional linear paper formats like PDF and HTML cause cognitive overload and obscure hierarchical structure.”
paper / aspuruguzik / Jul 3
SynTwins introduces a deterministic, retrosynthesis-guided framework that generates synthetically accessible molecular analogs by performing retrosynthesis on a target, searching for similar building blocks, and executing virtual forward synthesis. Unlike stochastic ML generators, it uses search algorithms to outperform SOTA models in producing feasible analogs with high structural similarity to targets. Integration into property-optimization pipelines yields feasible molecules with minimal property degradation, validated across diverse datasets.
retrosynthesis molecular-design synthetic-feasibility ai-chemistry generative-ai drug-discovery molecular-analogs
“SynTwins emulates expert chemists via three-step process: retrosynthesis, similar building block search, and virtual synthesis”
paper / aspuruguzik / Jun 26
The method enforces arbitrary boundary conditions in quantum algorithms for differential equations by augmenting the governing equations with penalty projections. Assuming a fast-forwardable projection representation, the gate complexity overhead scales as O(log λ) with penalty strength λ, or worst-case O((‖v(0)‖²‖A₀‖ + ‖b‖_{L¹[0;t]}^2) t² / ε) for precision ε in systems dv/dt = A₀(t) v(t) + b(t) with negative semidefinite A₀. For the heat equation, this yields Õ(d log n + log t) gate complexity. Constraint error bounds are proven, with numerical validation and circuit estimates via linear combination of Hamiltonian simulation.
quantum-algorithms differential-equations boundary-conditions penalty-projection quantum-simulation arxiv-paper
“Penalty projection method enforces arbitrary boundary conditions with gate complexity overhead O(log λ) assuming fast-forwardable projection.”
paper / aspuruguzik / Jun 5
QC-Daemon, a transformer-based reinforcement learning agent, compiles quantum circuits by solving the Atom Game for reconfigurable neutral atom arrays, optimizing atom layouts for parallel circuit execution. Trained on diverse circuits with physically motivated architectures, it reduces logarithmic infidelity on benchmarks up to 100 qubits. The approach demonstrates transferability, generalizing to unseen circuits without retraining.
quantum-circuits reinforcement-learning quantum-compilation neutral-atom-arrays transformer-models quantum-hardware
“QC-Daemon reduces logarithmic infidelity for benchmark problems up to 100 qubits by intelligently changing atom layouts.”
paper / aspuruguzik / May 20
RoboCulture is a flexible, low-cost platform using a general-purpose robotic manipulator to automate biological workflows, addressing limitations of current liquid handlers that require human intervention for plate loading, tip replacement, and calibration. It integrates liquid handling, lab equipment interaction, and computer vision for real-time optical density-based growth monitoring with force feedback. The system employs a modular behavior tree framework to robustly execute a fully autonomous 15-hour yeast culture experiment.
robotics automation biological-experimentation lab-automation computer-vision autonomous-robots yeast-culture
“Current liquid handlers require human intervention for plate loading, tip replacement, and calibration.”
paper / aspuruguzik / May 20
Quetzal is a scalable autoregressive model that generates 3D molecules atom-by-atom using a causal transformer for discrete atom types and a diffusion MLP for continuous positions. It surpasses existing autoregressive baselines in generation quality, matches state-of-the-art diffusion models, and enables faster generation via fewer transformer passes and exact likelihood computation. The architecture natively supports variable-size tasks like hydrogen decoration without modifications.
3d-molecule-generation autoregressive-models generative-models molecular-diffusion machine-learning chemical-physics scalable-generation
“Quetzal achieves substantial improvements in generation quality over existing autoregressive baselines.”
paper / aspuruguzik / May 5
El Agente Q is an LLM-based multi-agent system that interprets natural language prompts to autonomously generate, execute, and debug quantum chemistry workflows using a hierarchical memory architecture for task decomposition, tool selection, and file management. Benchmarked on six university-level exercises and two case studies, it achieves over 87% average task success with adaptive in situ error handling and support for multi-step executions. This enables accessible quantum chemistry for non-specialists while providing transparent action logs for experts.
autonomous-agents quantum-chemistry llm-agents multi-agent-systems computational-chemistry ai-for-science
“El Agente Q dynamically generates and executes quantum chemistry workflows from natural language prompts”
paper / aspuruguzik / Mar 11
ELECTRA is an equivariant Cartesian tensor network that predicts 3D electronic charge densities using floating Gaussian orbitals, with positions and coefficients learned from data via a symmetry-breaking mechanism that preserves rotation equivariance. It outperforms prior methods on benchmarks by balancing computational efficiency and accuracy. Initializing DFT calculations with ELECTRA densities reduces SCF iterations by 50.72% on average for unseen molecules.
electra-model floating-orbitals charge-density-prediction equivariant-network quantum-chemistry gaussian-splatting dft-acceleration
“ELECTRA uses floating orbitals placed freely in space, not centered on atoms, for more compact charge density representations.”
paper / aspuruguzik / Mar 4
Feynman-Kac Correctors (FKCs) provide an efficient, principled method for sampling from annealed, geometrically averaged, or product distributions using pretrained score-based models, derived from the Feynman-Kac formula and PDEs. Sequential Monte Carlo resampling with inference-time scaling simulates these distributions accurately, addressing limitations of heuristic classifier-free guidance that fails to approximate intermediate distributions. Empirical applications include amortized temperature annealing, multi-objective molecule generation, and enhanced text-to-image synthesis.
score-based-models feynman-kac-correctors diffusion-models inference-guidance sequential-monte-carlo generative-models molecule-generation
“FKCs enable sampling from sequences of annealed, geometric-averaged, or product distributions derived from pretrained score-based models.”
paper / aspuruguzik / Feb 26
Formulating chemical reaction condition optimization as Bayesian optimization over curried functions enables discovery of general parameters that perform well across multiple related tasks. Simple myopic strategies, decoupling parameter and task selection, match complex methods by effectively exploring both spaces. Evaluations on real-world data confirm accelerated identification of transferable optima applicable to unseen reactions.
bayesian-optimization chemical-reactions general-optimization curried-functions experiment-planning machine-learning
“General parameters for chemical reactions can be found via Bayesian optimization over curried functions”
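The curried-function view can be made concrete with a toy example. Fixing the reaction conditions turns a two-argument yield function into a function over tasks (here, substrates), and a general set of conditions is one whose curried function scores well on aggregate. The yield model, the mean aggregate, and all names below are hypothetical, and the exhaustive search stands in for the Bayesian optimization loop, which would choose which (condition, task) pairs to evaluate.

```python
def reaction_yield(temperature, substrate_optimum):
    """Hypothetical toy yield in [0, 1], peaked at a substrate-specific temperature."""
    return max(0.0, 1.0 - ((temperature - substrate_optimum) / 40.0) ** 2)

def curry_conditions(temperature):
    """Fix the conditions and return a function that maps a task to its yield."""
    def over_tasks(substrate_optimum):
        return reaction_yield(temperature, substrate_optimum)
    return over_tasks

def generality(temperature, substrates):
    """Aggregate (here: mean) of the curried function over all tasks."""
    per_task = curry_conditions(temperature)
    return sum(per_task(s) for s in substrates) / len(substrates)

def most_general(candidates, substrates):
    """Pick the candidate condition with the best aggregate score."""
    return max(candidates, key=lambda t: generality(t, substrates))
```

For substrates with optima at 60, 80, and 100 and candidate temperatures from 40 to 120 in steps of 10, `most_general` selects 80: no single substrate except the middle one is at its optimum there, but the average yield across tasks is highest.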
paper / aspuruguzik / Feb 6
AnyPlace is a two-stage method that uses a VLM to identify rough placement regions, enabling efficient training of a local pose-prediction model on synthetic data for diverse configurations like insertion, stacking, and hanging. Trained solely on randomly generated synthetic objects, it outperforms baselines in simulation on success rate, placement mode coverage, and precision. The model transfers zero-shot to real-world tasks, handling varying geometries and fine placements where others fail.
robot-manipulation object-placement vision-language-models synthetic-data sim-to-real robotic-tasks
“AnyPlace is trained entirely on synthetic data featuring randomly generated objects in insertion, stacking, and hanging configurations.”
paper / aspuruguzik / Jan 30
Recent ML models show that simple representations such as chemical formulas rival structure-informed ones for materials property prediction, contradicting the physical intuition that such representations are incomplete. This paper introduces a tomographic interpretation using information theory to define materials, representations, properties, and their relations. It validates the framework via exhaustive comparisons of property-augmented representations across prediction tasks, revealing how properties encode complementary structural information.
materials-science machine-learning structure-property-relations materials-representations information-theory property-prediction
“Simple materials representations like chemical formulas can achieve competitive property prediction performance without structural information.”
paper / aspuruguzik / Jan 28
The conditional Generative Quantum Eigensolver (conditional-GQE) employs an encoder-decoder Transformer to generate context-aware quantum circuits for combinatorial optimization problems. Trained on problems up to 10 qubits, it achieves nearly perfect performance on unseen instances via high expressiveness of classical generative models and efficient preference-based training. This hybrid quantum-classical framework enhances scalability and generalizability, bridging current hardware limitations toward fault-tolerant quantum computing.
quantum-computing combinatorial-optimization generative-quantum-eigensolver hybrid-quantum-classical quantum-circuits machine-learning transformers
“conditional-GQE is a context-aware quantum circuit generator using an encoder-decoder Transformer.”
paper / aspuruguzik / Dec 17
Stiefel Flow Matching generates 3D molecular structures on the Stiefel manifold St(n,4), embedding point clouds with fixed moments of inertia for exact constraint enforcement. It outperforms Euclidean diffusion models by leveraging high-precision rotational spectroscopy data, achieving higher success rates and faster sampling. Equivariant optimal transport approximations yield simpler, shorter flows, scaling to large molecules in the GEOM dataset.
stiefel-manifold flow-matching structure-elucidation molecular-dynamics generative-models rotational-spectroscopy optimal-transport
“The space of n-atom point clouds with fixed moments of inertia embeds in the Stiefel manifold St(n,4)”
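One way to see why fixed moments of inertia lead to St(n,4) is the following construction, offered as an assumption-laden sketch rather than the paper's exact map. Take coordinates centered at the center of mass and aligned with the principal axes, weight each atom by the square root of its mass, scale each coordinate column by its second moment, and append a column of normalized square-root masses. The resulting n×4 matrix has orthonormal columns. The sketch assumes a non-planar point cloud (all three second moments nonzero) and an input that is already centered and axis-aligned.

```python
import math

def stiefel_embedding(masses, coords):
    """Return the four columns of an n x 4 matrix built from a centered,
    principal-axis-aligned point cloud (see assumptions in the lead-in)."""
    total_mass = sum(masses)
    columns = []
    for axis in range(3):
        # Second moment along this axis; fixed whenever the moments of inertia are fixed.
        second_moment = sum(m * r[axis] ** 2 for m, r in zip(masses, coords))
        columns.append([
            math.sqrt(m) * r[axis] / math.sqrt(second_moment)
            for m, r in zip(masses, coords)
        ])
    # Fourth column encodes the center-of-mass constraint.
    columns.append([math.sqrt(m / total_mass) for m in masses])
    return columns

def gram_matrix(columns):
    """Matrix of inner products between columns; the identity means orthonormal columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]

# Toy cloud: pairs of equal masses placed symmetrically on each axis, so the
# center of mass is at the origin and the axes are already principal axes.
masses = [1.0, 1.0, 12.0, 12.0, 16.0, 16.0]
coords = [(1.2, 0.0, 0.0), (-1.2, 0.0, 0.0),
          (0.0, 0.8, 0.0), (0.0, -0.8, 0.0),
          (0.0, 0.0, 0.5), (0.0, 0.0, -0.5)]
gram = gram_matrix(stiefel_embedding(masses, coords))
```

Orthogonality of the coordinate columns to the fourth column expresses the zero center of mass, and their mutual orthogonality expresses the vanishing products of inertia in the principal-axis frame.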
paper / aspuruguzik / Dec 10
The k-agents framework uses LLM-based agents to structure unstructured laboratory knowledge and automate multi-step experiments via state machines with closed-loop feedback. Execution agents decompose procedures, interact to perform steps, analyze results, and drive state transitions. Demonstrated on a superconducting quantum processor, the agents autonomously calibrated and operated the device for hours and produced entangled states at a level matching that of human experts.
self-driving-labs ai-agents quantum-computing automated-experiments llm-agents laboratory-automation quantum-processors
“k-agents framework integrates unstructured, multimodal laboratory knowledge using LLM-based agents”
paper / aspuruguzik / Nov 20
The paper presents a quantum algorithm using product formulas and Trotterization for real-space simulation of vibronic dynamics under general Hamiltonians with arbitrary electronic states and vibrational modes. It introduces the first Trotterization beyond two states, together with optimized exponentiation, yielding low gate costs. It demonstrates feasibility with 154 qubits and 2.76e6 Toffoli gates for 100 fs on a 6-state, 21-mode anthracene dimer exciton model, and with 1053 qubits and 2.66e7 Toffoli gates for a 4-state, 246-mode anthracene-fullerene charge-transfer model.
quantum-algorithms vibronic-dynamics trotterization singlet-fission solar-cells quantum-simulation materials-discovery
“Quantum algorithm simulates time evolution under general vibronic Hamiltonian in real space for arbitrary electronic states and modes”
paper / aspuruguzik / Nov 14
AI's data-driven learning addresses quantum computing's counterintuitive high-dimensional mathematics, targeting key scaling hurdles across the QC stack. State-of-the-art AI techniques already enhance device design, software development, and applications in QC. Future progress hinges on cross-pollinating expertise from AI and QC domains.
ai-for-quantum-computing quantum-computing ai-applications quantum-hardware quantum-software arxiv-paper
“AI advancements have revolutionary impact on science and engineering, including quantum computing”
paper / aspuruguzik / Nov 4
Quantum linear systems solvers originated with HHL but suffered from limitations in runtime, precision, and dependence on expensive quantum methods. Post-HHL advancements optimize efficiency via diverse paradigms, achieving optimal lower bounds relative to error tolerance and condition number, as categorized in a proposed taxonomy. These developments draw from foundational quantum computing work and enable applications in differential equations, quantum machine learning, and many-body physics.
quantum-algorithms linear-systems hhl-algorithm quantum-computing quantum-survey quantum-applications
“HHL solver relies on computationally expensive quantum methods”
paper / aspuruguzik / Oct 31
Quantum Deep Equilibrium Models (QDEQs) adapt deep equilibrium models to parametrized quantum circuits (PQCs), using root solvers to find fixed points that emulate infinite-depth networks with reduced circuit depth and parameters. This mitigates error accumulation and measurement overhead in variational quantum algorithms. Experiments on MNIST-4 (4 qubits), full MNIST, FashionMNIST, and CIFAR-10 show QDEQs matching or exceeding baselines, including outperforming networks with 5x more layers using shallower circuits.
quantum-machine-learning deep-equilibrium-models variational-quantum-algorithms parametrized-quantum-circuits quantum-neural-networks arxiv-paper
“QDEQs are the first application of deep equilibrium models to quantum machine learning with PQCs.”
paper / aspuruguzik / Oct 11
Rank-based Bayesian Optimization (RBO) replaces regression surrogates with ranking models, prioritizing relative ordering over absolute property predictions for molecule selection. Evaluated on chemical datasets, RBO matches or exceeds standard BO performance, especially on rough structure-property landscapes with activity cliffs. Ranking ability strongly correlates with optimization success and persists early in BO iterations, positioning RBO as a robust alternative for novel compound optimization.
bayesian-optimization ranking-model molecule-selection chemical-optimization machine-learning self-driving-labs activity-cliffs
“RBO achieves similar or improved optimization performance compared to conventional regression-based BO on various chemical datasets.”
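What "prioritizing relative ordering" means for a surrogate can be shown with a pairwise ranking loss. The sketch below uses a RankNet-style logistic loss, which is one common choice; the summary does not say which ranking objective the paper uses, so the loss and the function name are assumptions.

```python
import math

def pairwise_logistic_loss(scores, targets):
    """Average logistic loss over all pairs (i, j) where targets[i] > targets[j].

    The loss is small when the model scores item i above item j for every such
    pair. Only the ordering of `targets` is used, never their magnitudes.
    """
    total, n_pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if targets[i] > targets[j]:
                total += math.log1p(math.exp(-(scores[i] - scores[j])))
                n_pairs += 1
    return total / n_pairs
```

Because the loss depends only on comparisons, any monotonic rescaling of the measured property leaves it unchanged. A regression loss would instead be dominated by large property jumps, which is one intuition for why ranking can help on landscapes with activity cliffs.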
paper / aspuruguzik / Oct 10
The method reformulates Doob's h-transform for Brownian dynamics as a variational optimization over trajectories from an initial state to a rare endpoint state, avoiding the exponentially large number of sampled trajectories otherwise required. It uses a simulation-free training objective with boundary-conditioned parameterization to directly optimize path likelihoods. This reduces search spaces dramatically and eliminates inefficient importance sampling, succeeding on molecular simulations and protein folding.
doobs-h-transform rare-event-sampling transition-path-sampling variational-methods molecular-simulation protein-folding
“Doob's h-transform definitively solves conditioning Brownian motion with known drift to reach a specific endpoint.”
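For reference, the standard textbook form of Doob's h-transform is sketched below. The notation (drift b, constant noise scale σ, horizon T, endpoint x_T) is assumed here and is not taken from the paper.

```latex
% Unconditioned dynamics:
%   dX_t = b(X_t)\,dt + \sigma\,dW_t
% Probability of reaching the endpoint from state x at time t:
%   h(x,t) = p(X_T = x_T \mid X_t = x)
% Conditioned dynamics (Doob's h-transform):
\[
  dX_t = \bigl[\, b(X_t) + \sigma^{2}\,\nabla_x \log h(X_t,t) \,\bigr]\,dt + \sigma\,dW_t ,
\]
% where h solves the backward Kolmogorov equation
\[
  \partial_t h + b\cdot\nabla_x h + \tfrac{\sigma^{2}}{2}\,\Delta_x h = 0 .
\]
% Special case b = 0 (Brownian bridge): \sigma^{2}\nabla_x \log h = (x_T - x)/(T - t).
```

In general h is not available in closed form, which is why a variational, learned approximation of the extra drift term is attractive.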
paper / aspuruguzik / Oct 5
Symmetry-cloning induces group equivariance in general architectures such as MLPs by training them, in a supervised fashion, to imitate equivariant models. This approach captures the inductive bias of group-equivariant architectures without hardwiring specific symmetries or basis functions. Learned symmetries can be retained or selectively broken for downstream tasks, avoiding over-engineered relaxations in symmetry-breaking scenarios.
group-equivariance symmetry-cloning equivariant-models supervised-learning machine-learning arxiv-paper inductive-bias
“Symmetry-cloning induces equivariance in machine learning models”
youtube / aspuruguzik / Sep 24
AI-guided self-driving laboratories are revolutionizing discovery in chemistry and materials science. This involves using machine learning for reaction optimization, property prediction, and inverse design, with a focus on strategic data acquisition and robust representation to accelerate the development of new materials and chemical processes. The field emphasizes the importance of multidisciplinary collaboration and a critical evaluation of research impact.
ai-materials-science chemistry-ai-applications self-driving-labs ai-research-strategy scientific-discovery-automation machine-learning-representations robotics-in-science
“Machine learning for chemistry and Material Science requires task-specific models due to the multiscale and complex nature of the disciplines.”
paper / aspuruguzik / Sep 16
Machine learning is now applied across a wide range of chemistry and materials science problems, but the field remains immature and much of its potential is untapped. The perspective outlines existing ML applications, examines how ML researchers approach chemistry challenges, and offers targeted recommendations for high-impact research. It aims to bridge ML techniques with domain-specific needs so that scientific applications mature faster.
machine-learning chemistry materials-science ai-research impactful-research arxiv-paper
“Machine learning has pervasively influenced chemistry and materials science fields.”
paper / aspuruguzik / Sep 12
Corrected product formulas (CPFs) enhance standard product formulas for Hamiltonian simulation by inserting auxiliary corrector terms, significantly improving accuracy for Hamiltonians split into two exactly simulable partitions, as in lattice models. CPFs add only a small factor to simulation cost while leveraging perturbation norms for better error control, especially in perturbed systems. Numerical simulations and quantum hardware tests confirm that CPFs match or outperform what the theoretical bounds predict relative to standard product formulas, which helps both resource-limited fault-tolerant quantum computing and classical simulation.
quantum-simulation product-formulas hamiltonian-simulation corrected-product-formulas lattice-hamiltonians quantum-algorithms fault-tolerant-quantum
“CPFs improve accuracy of product formulas for Hamiltonians with two exactly simulable partitions.”
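As background, the sketch below shows the standard first-order product formula that CPFs are said to improve on. It does not implement the corrector terms, whose form the summary does not give. The single-qubit Hamiltonian H = X + Z and the helper names are illustrative assumptions.

```python
import numpy as np

def expm_herm(H, t):
    """exp(-i t H) for a Hermitian matrix H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * t * w)) @ V.conj().T

# Two partitions, each exactly simulable on its own (toy example).
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def trotter_error(t, n):
    """Spectral-norm error of the first-order formula (e^{-iXt/n} e^{-iZt/n})^n."""
    exact = expm_herm(X + Z, t)
    step = expm_herm(X, t / n) @ expm_herm(Z, t / n)
    approx = np.linalg.matrix_power(step, n)
    return np.linalg.norm(approx - exact, 2)

err_10 = trotter_error(1.0, 10)
err_100 = trotter_error(1.0, 100)
print(err_10, err_100)  # error shrinks roughly in proportion to 1/n
```

The error of this uncorrected formula is controlled by the commutator of the two partitions and falls only as 1/n, so any scheme that reduces it at small extra cost per step lowers the total number of steps needed.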
paper / aspuruguzik / Aug 16
Understanding in AI is defined as the composability of relevant inputs into verifier-satisfactory outputs, a definition that applies to AIs, animals, and institutions alike. Catalysts are inputs that enhance composition quality, and a subject's structure is revealed by its catalytic components. Learning equates to composing inputs into inner catalysts; language models' ability to generate self-catalysts lays the groundwork for general intelligence by overcoming limits on understanding.
ai-understanding composability catalysts ai-learning general-intelligence ai-theory
“Understanding of an object by any subject is its ability to compose relevant inputs into satisfactory outputs as judged by a verifier.”
paper / aspuruguzik / Jun 23
Integrating chemistry-aware LLMs into evolutionary algorithms (EAs) redesigns mutation and crossover operations to leverage vast chemical corpora, reducing reliance on random changes and expensive evaluations. Empirical tests across property optimization, molecular rediscovery, and structure-based drug design show superior performance over baselines in single- and multi-objective scenarios using commercial and open-source models. The approach boosts solution quality, convergence speed, and efficiency by minimizing objective function calls.
llm-chemical-search evolutionary-algorithms molecular-optimization drug-design chemical-space ai-chemistry
“LLM-redesigned crossover and mutation operations in EAs outperform standard random EAs in molecular discovery tasks”
paper / aspuruguzik / Jun 10
This study assesses fault-tolerant quantum computers for accelerating homogeneous catalyst discovery in nitrogen fixation by evaluating ground-state energy estimation problems across economic utility, classical hardness, and quantum resources. For the highest-utility case (two steps in cyanate anion generation from dinitrogen), quantum computation yields $200,000 in economic value and requires 139,000 QPU-hours on superconducting devices via double-factorized phase estimation. An equivalent classical DMRG calculation demands ~400,000 CPU-hours, indicating potential for quantum advantage as hardware continues to advance.
quantum-computing fault-tolerant-quantum homogeneous-catalysts nitrogen-fixation catalyst-discovery quantum-chemistry arxiv-paper
“Development of new homogeneous catalysts could greatly improve efficiency of chemical production, particularly for nitrogen fixation.”
paper / aspuruguzik / Mar 26
Application-driven ML research, inspired by real-world challenges, is under-valued compared to methods-driven approaches despite its potential for domain-specific impact and algorithmic advancements. The two paradigms are complementary: practical needs give rise to novel algorithms that feed back into methods-driven work. Current reviewing, hiring, and teaching practices hinder application-driven research and need reform to enable broader innovation.
application-driven-research machine-learning-innovation methods-driven-research ml-community-practices research-paradigms icml-paper
“Application-driven research has been systemically under-valued in the machine learning community”