Llama 2
2 mentions across 2 people
“Llama 2 performance apparently scales linearly at least as far as 32 chips, which at peak can generate almost 2,000 tokens per second.”
— Emerging AI Inference Accelerators: A Landscape of Specialization

“He and Meta AI have been big proponents of open sourcing AI development, and have been walking the walk by open sourcing many of their biggest models, including LLaMA 2 and eventually LLaMA 3.”
— Beyond Autoregressive LLMs: The Case for World Models and Joint Embedding
