
Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning

1 mention from 1 person

Yann LeCun
paper · 2024-11-04
Recommended

In this work, we identify representation collapse in the model's intermediate layers as a key factor limiting its reasoning capabilities. To address this, we propose Sequential Variance-Covariance Regularization (Seq-VCR), which enhances the entropy of intermediate representations and prevents collapse.
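The abstract does not spell out the regularizer itself, but a variance-covariance penalty of the kind the name suggests can be sketched as follows. This is a minimal NumPy illustration, assuming a VICReg-style form (a hinge on per-feature standard deviation plus an off-diagonal covariance penalty); the exact loss, coefficients, and how it is applied across layers and sequence positions in Seq-VCR are assumptions, not taken from the paper.

```python
import numpy as np

def vc_reg_loss(h, gamma=1.0, eps=1e-4):
    """Hypothetical variance-covariance penalty for one layer's
    representations h of shape (batch, dim). Low when features have
    healthy variance and are decorrelated; high when they collapse."""
    h = h - h.mean(axis=0)                       # center features
    std = np.sqrt(h.var(axis=0) + eps)
    # Variance term: hinge pushing each feature's std up toward gamma.
    var_loss = np.mean(np.maximum(0.0, gamma - std))
    # Covariance term: penalize off-diagonal covariance entries,
    # discouraging redundant (collapsed) feature directions.
    n, d = h.shape
    cov = (h.T @ h) / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = (off_diag ** 2).sum() / d
    return var_loss + cov_loss
```

During training such a term would be added, with some weight, to the usual language-modeling loss at selected intermediate layers; representations whose features are all copies of one another incur a large covariance penalty, while well-spread decorrelated features incur almost none.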

Seq-VCR: Regularization for Enhanced Transformer Reasoning