Technical Terms

RLHF

Definition

Reinforcement Learning from Human Feedback, a method for improving model behaviour by training a reward model on human preference comparisons and then optimising the model against that reward.

In Plain English

Adjusting the model according to which of its answers people preferred.
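The reward-modelling step at the heart of RLHF can be sketched in a few lines. The sketch below is illustrative, not any particular library's implementation: it assumes pairwise preference data (a preferred and a rejected response per prompt), and the names RewardModel and preference_loss, along with the random tensors standing in for response embeddings, are hypothetical. It trains a scalar reward head with the Bradley-Terry objective, which pushes the preferred response's score above the rejected one's.

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Maps a response embedding to a scalar reward score."""
    def __init__(self, dim: int):
        super().__init__()
        self.head = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(x).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry objective: maximise the probability that the
    # human-preferred response scores higher than the rejected one.
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage: random tensors stand in for response embeddings.
model = RewardModel(dim=16)
chosen = torch.randn(8, 16)    # embeddings of preferred responses
rejected = torch.randn(8, 16)  # embeddings of rejected responses
loss = preference_loss(model(chosen), model(rejected))
loss.backward()  # gradients flow; an optimiser step would follow
```

In a typical pipeline, the trained reward model then scores candidate outputs while a policy-optimisation method such as PPO fine-tunes the language model to maximise that reward.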

Related Terms