absorb.md

Nexus Optimizer

paper · 2026-04-13

To address this, we propose the Nexus optimizer, which encourages the closeness of these minima by maximizing gradient similarity during optimization. Experiments across models ranging from 130M to 3B parameters, with various data mixtures and hyperparameter schedules, show that Nexus significantly boosts downstream performance.
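The excerpt does not say how gradient similarity is measured or maximized, so the following is only a minimal sketch of the quantity involved, not the Nexus update rule. It assumes cosine similarity as the measure and uses two hypothetical toy quadratic losses (`center_a`, `center_b`) to stand in for two tasks with different minima.

```python
import math


def grad_quadratic(theta, center):
    """Gradient of the toy loss 0.5 * ||theta - center||^2 (hypothetical task)."""
    return [t - c for t, c in zip(theta, center)]


def cosine_similarity(g1, g2):
    """Cosine similarity of two gradient vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = math.sqrt(sum(a * a for a in g1))
    n2 = math.sqrt(sum(b * b for b in g2))
    if n1 == 0.0 or n2 == 0.0:
        return 0.0
    return dot / (n1 * n2)


# Assumed setup: two tasks whose minima sit at different points.
theta = [0.0, 0.0]
center_a = [1.0, 0.0]
center_b = [0.0, 1.0]

g_a = grad_quadratic(theta, center_a)
g_b = grad_quadratic(theta, center_b)
sim = cosine_similarity(g_a, g_b)
print(f"gradient cosine similarity: {sim:.3f}")
```

In this toy setup the two gradients are orthogonal, so each task's descent direction neither helps nor hurts the other. A similarity near 1 would mean both tasks agree on the update direction, which is the regime an optimizer that maximizes gradient similarity would favor.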

Nexus Optimizer Aligns Task Minima for Superior Downstream Generalization at Ide