paper / yulunwang / 2d ago
A hybrid quantum-classical pipeline combines a custom ansatz, dual variational updates, parametric compilation, hardware error suppression, and O(n) post-processing to solve unconstrained binary combinatorial optimization on gate-model quantum hardware. Without these components, device output is indistinguishable from random sampling, underscoring their necessity. On IBM devices, the pipeline reaches approximation ratios of 100% for Max-Cut on 3-regular graphs up to 156 qubits and ≥99.5% for 156-qubit spin-glass ground states, outperforming classical local solvers.
quantum-computing · combinatorial-optimization · error-suppression · variational-algorithms · quantum-hardware
“Standard circuit execution without the integrated pipeline produces output indistinguishable from random sampling at scale.”
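The headline metric here, the approximation ratio, is cheap to verify per sample, which is what makes the paper's O(n) post-processing claim plausible for bounded-degree graphs. A minimal sketch, using a hypothetical 3-regular graph (the 3-cube; the paper's actual instances, ansatz, and compilation are not shown):

```python
from itertools import product

# Hypothetical 3-regular instance: the 3-cube graph (8 nodes, 12 edges,
# every node has degree 3). Stand-in for the paper's benchmark graphs.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0),
         (4, 5), (5, 6), (6, 7), (7, 4),
         (0, 4), (1, 5), (2, 6), (3, 7)]

def cut_value(bits):
    """Count edges crossing the partition -- O(n) for bounded-degree graphs."""
    return sum(bits[u] != bits[v] for u, v in EDGES)

# Brute-force optimum is feasible at 8 nodes; at 156 qubits the point of the
# pipeline is that raw sampling never finds near-optimal strings.
best = max(cut_value(bits) for bits in product([0, 1], repeat=8))

sample = (0, 1, 0, 1, 1, 0, 1, 0)   # stand-in for one measured bitstring
ratio = cut_value(sample) / best
print(f"cut={cut_value(sample)}, optimum={best}, ratio={ratio:.3f}")
# → cut=12, optimum=12, ratio=1.000
```

The 3-cube is bipartite, so its maximum cut includes all 12 edges; a sampled bitstring matching that bipartition scores an approximation ratio of exactly 1.0, the "100%" figure quoted above.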
paper / yulunwang / 2d ago
The NCCLX framework addresses communication bottlenecks in LLM training and inference on GPU clusters exceeding 100,000 GPUs, optimizing for both high-throughput synchronous training and low-latency inference. It enables next-generation LLMs to operate at unprecedented scale.
collective-communication · distributed-systems · gpu-clusters · large-language-models · llm-infrastructure · high-performance-computing · networking
“Traditional communication methods limit large language model (LLM) scaling due to throughput and latency issues.”
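The abstract does not describe NCCLX's internals, but the baseline it improves on is the classic ring all-reduce that NCCL-family libraries implement: each rank sends a fixed 2(n−1) chunks regardless of cluster size, which is bandwidth-optimal but whose latency grows with ring length, the tension at 100,000-GPU scale. A single-process simulation of the pattern (illustrative only, not NCCLX's algorithm):

```python
def ring_allreduce(chunks):
    """Simulate a ring all-reduce (sum) on one process.

    chunks[r][c] is rank r's contribution to chunk c, with n ranks each
    holding n chunks. Returns per-rank buffers; every rank ends with the
    element-wise sum across all ranks.
    """
    n = len(chunks)
    data = [list(row) for row in chunks]

    # Phase 1: reduce-scatter. At step s, rank r sends chunk (r - s) % n to
    # ring neighbor (r + 1) % n, which accumulates it. After n-1 steps,
    # rank r holds the fully reduced chunk (r + 1) % n.
    for s in range(n - 1):
        for r in range(n):
            c = (r - s) % n
            data[(r + 1) % n][c] += data[r][c]

    # Phase 2: all-gather. At step s, rank r forwards its freshest reduced
    # chunk (r + 1 - s) % n around the ring; after n-1 steps all ranks
    # hold every reduced chunk.
    for s in range(n - 1):
        for r in range(n):
            c = (r + 1 - s) % n
            data[(r + 1) % n][c] = data[r][c]
    return data

out = ring_allreduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
print(out)  # every rank now holds [111, 222, 333]
```

Each of the 2(n−1) steps is a neighbor-to-neighbor transfer, so per-rank traffic stays constant as the cluster grows; the step count (and hence latency) does not, which is one reason frameworks targeting both throughput-bound training and latency-bound inference need more than a single collective algorithm.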