Technical Terms

Attention Head

Definition

One of several parallel attention pathways inside a transformer block's attention layer; different heads often specialise in different token relationships or patterns.

In Plain English

One of the model's separate mini-focus channels inside an attention layer.
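To make the "separate mini-focus channels" idea concrete, here is a minimal NumPy sketch of multi-head self-attention. It is an illustration under simplifying assumptions, not a real transformer implementation: the learned query/key/value and output projections are omitted, and every function name here is hypothetical.

```python
import numpy as np

def split_into_heads(x, num_heads):
    # (seq_len, d_model) -> (num_heads, seq_len, d_head):
    # each head gets its own lower-dimensional slice of the embedding.
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    return x.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

def attention_head(q, k, v):
    # Scaled dot-product attention for one head: each token mixes
    # information from all tokens, weighted by similarity.
    d_head = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax rows
    return weights @ v

def multi_head_attention(x, num_heads):
    # Heads attend over the same tokens in parallel, each in its own
    # subspace; their outputs are concatenated back to d_model.
    heads = split_into_heads(x, num_heads)
    outs = [attention_head(h, h, h) for h in heads]
    return np.concatenate(outs, axis=-1)
```

Because each head operates on its own slice of the embedding, one head's attention weights can differ entirely from another's over the same tokens, which is what lets heads specialise.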

See Also