Yan
1 mention across 0 people
All mentions
Unknown speaker
Recommended paper · 2026-04-17
“Building on MoE-FM, we develop a non-autoregressive (NAR) language modeling approach, named YAN, instantiated with both Transformer and Mamba architectures.”
MoE-FM Boosts Language Model Inference Speed and Quality