🛠 tool

Llama321b

1 mention across 0 people

All mentions

Unknown speaker

paper · 2026-04-10

Recommended

“Three small frozen models (Llama-3.2-1B, Qwen2.5-1.5B, Gemma-2-2B) encode the input into a shared latent space whose aggregate signal is injected into two larger frozen models (Phi-3-mini, Mistral-7B), whose representations feed a lightweight cross-attention output node.”

Feedforward Graphs Enable Collaborative LLM Reasoning with Minimal Training
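The data flow in the quote above can be sketched as a toy feedforward graph. This is only an illustrative sketch, not the paper's implementation: the frozen models are replaced by deterministic hash-based stand-ins, and the latent width `DIM`, the mean aggregation, the additive injection, the all-ones query, and every function name (`toy_encoder`, `mean_vectors`, `inject`, `cross_attention`, `forward`) are assumptions introduced here for illustration.

```python
import hashlib
import math

DIM = 8  # toy width of the shared latent space (assumption)


def toy_encoder(name, text, dim=DIM):
    """Stand-in for a frozen model: deterministic pseudo-embedding of text."""
    digest = hashlib.sha256(f"{name}:{text}".encode("utf-8")).digest()
    # Map the first `dim` bytes of the digest to floats in [-1, 1].
    return [b / 127.5 - 1.0 for b in digest[:dim]]


def mean_vectors(vectors):
    """Aggregate several latent vectors by element-wise mean (assumption)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def inject(name, text, signal):
    """Stand-in for a larger frozen model that receives the aggregate signal."""
    own = toy_encoder(name, text, len(signal))
    # Additive injection is an assumption; the quote does not say how it is done.
    return [o + s for o, s in zip(own, signal)]


def cross_attention(query, states):
    """Single-head scaled dot-product attention of one query over the states."""
    scale = math.sqrt(len(query))
    scores = [sum(q * s for q, s in zip(query, st)) / scale for st in states]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(sc - peak) for sc in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    output = [
        sum(w * st[i] for w, st in zip(weights, states))
        for i in range(len(query))
    ]
    return output, weights


def forward(text):
    """Run the three-stage graph: small encoders, large models, output node."""
    small = ["Llama-3.2-1B", "Qwen2.5-1.5B", "Gemma-2-2B"]
    large = ["Phi-3-mini", "Mistral-7B"]
    signal = mean_vectors([toy_encoder(m, text) for m in small])
    states = [inject(m, text, signal) for m in large]
    query = [1.0] * DIM  # placeholder for the output node's learned query
    return cross_attention(query, states)


output, weights = forward("What is 2 + 2?")
print(len(output), [round(w, 3) for w in weights])
```

In this sketch only the output node would carry trainable parameters (here reduced to a fixed query), which mirrors the quote's description of frozen models feeding a lightweight cross-attention node.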