Top-P (Nucleus Sampling)
A sampling method that limits token choices to the smallest set of most likely tokens whose combined probability reaches a threshold p.
Top-P (also called nucleus sampling) is an alternative to temperature for controlling randomness. Where temperature reshapes the whole probability distribution, top-p truncates it: set top_p = 0.9, and the model samples only from the smallest set of most likely tokens whose cumulative probability reaches 90%, discarding the long tail of unlikely options. The probabilities of the remaining tokens are then rescaled so they sum to 1.
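Because the cutoff adapts to the model's confidence, top-p keeps only a few candidates when one token dominates and more when probability is spread across many tokens, which tends to produce more balanced results than temperature alone. Most modern models accept both temperature and top-p parameters, and the two are often used together. Top-P = 1.0 is equivalent to no filtering; lower values (0.5–0.9) narrow the choices.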
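The truncation step described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation: the names top_p_filter, sample_top_p, and example_probs are hypothetical, and the probabilities are made up for the example. Real implementations usually work on logits over the full vocabulary and may differ in edge cases, such as whether the token that crosses the threshold is kept.

```python
import random

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize so they sum to 1."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in the range (0, 1]")
    # Rank tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    nucleus = []
    cumulative = 0.0
    for token, p in ranked:
        nucleus.append((token, p))
        cumulative += p
        if cumulative >= top_p:  # threshold reached: drop the rest of the tail
            break
    total = sum(p for _, p in nucleus)
    return {token: p / total for token, p in nucleus}

def sample_top_p(probs, top_p, rng=random):
    """Draw one token at random from the renormalized nucleus."""
    nucleus = top_p_filter(probs, top_p)
    return rng.choices(list(nucleus), weights=list(nucleus.values()), k=1)[0]

# Made-up next-token probabilities for "The capital of France is"
# (assumed values, for illustration only).
example_probs = {
    "Paris": 0.75,
    "the": 0.125,
    "a": 0.0625,
    "banana": 0.03125,
    "cloud": 0.03125,
}

print(list(top_p_filter(example_probs, 0.9)))  # tokens kept at top_p = 0.9
print(len(top_p_filter(example_probs, 1.0)))   # top_p = 1.0 keeps every token
print(sample_top_p(example_probs, 0.9))        # one randomly sampled token
```

With these assumed numbers, top_p = 0.9 keeps "Paris", "the", and "a" (cumulative probability 0.9375) and drops the other two, while top_p = 1.0 keeps all five tokens.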
For most users, this is an advanced setting; your AI tool probably handles it automatically.
Example
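At top-p = 0.9 for the prompt "The capital of France is", the model might still consider likely continuations such as "Paris", "the Eiffel City", or "a beautiful city", but not unlikely ones such as "banana" or "a cloud". In practice the filter is applied one token at a time, so each phrase here stands in for its first token.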