Parameters & Settings
TTFT (Time to First Token)
Definition
The delay between sending a request and receiving the first generated token back from the model.
In Plain English
How long you wait before the model starts answering at all.
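One way to make the definition concrete is to time it. The sketch below is a minimal illustration, not any particular provider's API: `measure_ttft` and `fake_stream` are hypothetical names, and `fake_stream` is a stand-in that simulates a streaming response with a fixed delay before its first token. It assumes the model's response can be consumed as an iterator of tokens.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure_ttft(start_stream: Callable[[], Iterable[str]]) -> Tuple[float, str]:
    """Return (seconds until the first token arrived, that first token)."""
    sent_at = time.perf_counter()      # the moment the request is sent
    stream = iter(start_stream())      # hypothetical call that opens a streaming response
    first_token = next(stream)         # blocks until the first token arrives
    return time.perf_counter() - sent_at, first_token


def fake_stream(first_delay: float = 0.05) -> Iterator[str]:
    """Stand-in for a streaming model response (an assumption, not a real API)."""
    time.sleep(first_delay)            # simulated wait before the first token
    yield "Hello"
    time.sleep(0.01)                   # later tokens do not affect TTFT
    yield " world"


ttft, token = measure_ttft(fake_stream)
print(f"TTFT: {ttft * 1000:.0f} ms, first token: {token!r}")
```

The timer stops at the first token only, so the time taken to generate the rest of the response is deliberately excluded from the measurement.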