Llama321b
Unknown speaker
Recommended paper · 2026-04-10
“Three small frozen models (Llama-3.2-1B, Qwen2.5-1.5B, Gemma-2-2B) encode the input into a shared latent space whose aggregate signal is injected into two larger frozen models (Phi-3-mini, Mistral-7B), whose representations feed a lightweight cross-attention output node.”
Feedforward Graphs Enable Collaborative LLM Reasoning with Minimal Training
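The data flow in the quoted description can be sketched in a few lines of plain Python. This is only an illustrative toy under stated assumptions, not the paper's implementation: the frozen models are replaced by fixed random linear maps, the "aggregate signal" is assumed to be a simple mean, injection is assumed to be additive, and the output node is assumed to be single-query cross-attention. All names and dimensions (`forward`, `to_latent`, `inject`, `LATENT_DIM`, and so on) are hypothetical.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols):
    """Random matrix as a list of rows; stands in for fixed or trainable weights."""
    scale = 1.0 / math.sqrt(cols)
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

INPUT_DIM, LATENT_DIM = 8, 4
SMALL_DIMS = [6, 5, 7]   # hypothetical hidden sizes for the three small models
LARGE_DIMS = [10, 12]    # hypothetical hidden sizes for the two larger models

# Frozen stand-ins: fixed linear maps in place of real LLMs (assumption).
small_models = [rand_matrix(d, INPUT_DIM) for d in SMALL_DIMS]
large_models = [rand_matrix(d, INPUT_DIM) for d in LARGE_DIMS]

# The pieces that would be trained (assumption): projections and one query.
to_latent = [rand_matrix(LATENT_DIM, d) for d in SMALL_DIMS]
inject = [rand_matrix(d, LATENT_DIM) for d in LARGE_DIMS]
to_output = [rand_matrix(LATENT_DIM, d) for d in LARGE_DIMS]
query = [random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]

def forward(x):
    # Layer 1: each small model encodes the input, then projects to the shared space.
    latents = [matvec(p, matvec(m, x)) for m, p in zip(small_models, to_latent)]
    # Aggregate signal: a plain mean over the three latents (assumption).
    aggregate = [sum(col) / len(latents) for col in zip(*latents)]
    # Layer 2: add the projected aggregate to each larger model's hidden state.
    hidden = []
    for m, inj in zip(large_models, inject):
        h = matvec(m, x)
        delta = matvec(inj, aggregate)
        hidden.append([a + b for a, b in zip(h, delta)])
    # Output node: one query attends over the two larger-model representations.
    keys = [matvec(p, h) for p, h in zip(to_output, hidden)]
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(LATENT_DIM)
              for key in keys]
    weights = softmax(scores)
    output = [sum(w * k[i] for w, k in zip(weights, keys))
              for i in range(LATENT_DIM)]
    return output, weights
```

Calling `forward([1.0] * INPUT_DIM)` returns a vector of length `LATENT_DIM` plus two attention weights that sum to 1, one per larger model. In this sketch only the projections and the query would receive gradients, which is one plausible reading of "minimal training".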