Do Transformers Need Three Projections? Systematic Study of QKV Variants
BrainChip researchers Anusha Madan Gopal and Ali Kayyam, along with M. Anthony Lewis, challenge a foundational assumption in transformer design — that three separate query, key, and value projections are always necessary. Their ICML 2026 study finds that sharing projections can match standard transformer performance while cutting KV cache requirements by up to 96.9%, a meaningful efficiency gain for on-device AI inference.
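
To give a feel for the scale of that saving, here is a back-of-the-envelope sketch of KV cache sizing. It is not the authors' method: the helper `kv_cache_bytes`, the model dimensions, and the specific combination shown (one tensor serving as both key and value, plus 16 key/value heads collapsed into one) are illustrative assumptions, chosen only because they happen to yield a 1/32 cache footprint, which is a 96.875% reduction and rounds to the quoted 96.9%. The study may arrive at its figure by a different route.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   tensors_per_token=2, bytes_per_elem=2):
    """Bytes needed to cache attention state for one sequence.

    tensors_per_token is 2 when keys and values are cached separately,
    and 1 when a single shared projection is reused as both (assumption).
    bytes_per_elem=2 corresponds to 16-bit storage.
    """
    return (n_layers * n_kv_heads * head_dim * seq_len
            * tensors_per_token * bytes_per_elem)


# Hypothetical baseline: separate K and V, 16 key/value heads per layer.
baseline = kv_cache_bytes(n_layers=12, n_kv_heads=16, head_dim=64, seq_len=2048)

# Hypothetical shared variant: one cached tensor, one key/value head.
shared = kv_cache_bytes(n_layers=12, n_kv_heads=1, head_dim=64, seq_len=2048,
                        tensors_per_token=1)

reduction = 1 - shared / baseline
print(f"baseline: {baseline / 2**20:.1f} MiB, shared: {shared / 2**20:.1f} MiB")
print(f"reduction: {reduction:.1%}")
```

Because the layer count, head dimension, and sequence length cancel out of the ratio, the percentage depends only on how many tensors and heads are cached, which is why savings of this kind matter most on memory-constrained devices.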