
This confuses me. You have your model, you have your tokens.

If the tokens are bit-for-bit-identical, where does the non-determinism come in?

If the tokens are only roughly-the-same-thing-to-a-human, sure I guess, but converging on roughly the same output for roughly the same input should inherently be a goal of LLM development.




Most any LLM has a "temperature" setting, which controls how much randomness is injected at the sampling step (the weights themselves stay fixed) to intentionally cause exactly this nondeterministic behavior. Good for creative tasks, bad for repeatability. If you're running one of the open models, set the temperature down to 0 and it suddenly becomes perfectly consistent.
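
Roughly, a minimal sketch of what that sampling step looks like (the logit values below are made up for illustration; real stacks differ in the details):

    import numpy as np

    def sample_next_token(logits, temperature, rng):
        """Sample a token index from logits, scaled by temperature."""
        if temperature == 0:
            # Degenerate case: greedy decoding, fully deterministic.
            return int(np.argmax(logits))
        scaled = logits / temperature          # higher T flattens the distribution
        probs = np.exp(scaled - scaled.max())  # softmax (shifted for numerical stability)
        probs /= probs.sum()
        return int(rng.choice(len(logits), p=probs))

    rng = np.random.default_rng()
    logits = np.array([3.0, 2.5, 1.0])  # made-up scores for 3 candidate tokens
    print(sample_next_token(logits, temperature=0, rng=rng))    # always token 0
    print(sample_next_token(logits, temperature=1.0, rng=rng))  # usually 0, sometimes 1 or 2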


You can get deterministic output even with a high temp.

Whatever "random" seed was used can be reused.
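
A toy illustration of the seed point (numpy standing in here for a real inference stack, assuming the serving layer exposes a seed at all):

    import numpy as np

    probs = [0.5, 0.3, 0.2]  # some high-temperature token distribution

    # Same seed -> same "random" draws -> identical output sequence.
    a = np.random.default_rng(seed=42).choice(3, size=10, p=probs)
    b = np.random.default_rng(seed=42).choice(3, size=10, p=probs)
    assert (a == b).all()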


The model outputs probabilities, which you then sample from randomly. Always choosing the highest-probability token (greedy decoding) leads to poor results in practice, such as the model tending to repeat itself. It's a sort of Monte Carlo approach.


The trained model is just a bunch of statistics. To use those statistics to generate text you need to "sample" from the model. If you always sampled by taking the model's #1 token prediction, that would be deterministic, but more commonly a random top-K or top-p token selection is made, which is where the randomness comes in.
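
A rough sketch of what top-K and top-p (nucleus) selection look like, assuming you already have the model's probability vector for the next token (values made up):

    import numpy as np

    def top_k_sample(probs, k, rng):
        """Keep the k most likely tokens, renormalize, sample."""
        idx = np.argsort(probs)[-k:]     # indices of the top-k tokens
        p = probs[idx] / probs[idx].sum()
        return int(rng.choice(idx, p=p))

    def top_p_sample(probs, p_threshold, rng):
        """Keep the smallest set of tokens whose cumulative mass >= p_threshold."""
        order = np.argsort(probs)[::-1]  # most likely first
        cum = np.cumsum(probs[order])
        cutoff = int(np.searchsorted(cum, p_threshold)) + 1
        idx = order[:cutoff]
        p = probs[idx] / probs[idx].sum()
        return int(rng.choice(idx, p=p))

    rng = np.random.default_rng(0)
    probs = np.array([0.55, 0.25, 0.12, 0.05, 0.03])
    print(top_k_sample(probs, k=2, rng=rng))              # only tokens 0 or 1
    print(top_p_sample(probs, p_threshold=0.9, rng=rng))  # tokens 0..2 (0.55+0.25+0.12 >= 0.9)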


It is technically possible to make it fully deterministic if you have complete control over the model, quantization, and sampling processes. The GP probably meant to say that most commercially available LLM services don't give you that control.


Actually, you just have to set the temperature to zero.


> If the tokens are bit-for-bit-identical, where does the non-determinism come in?

By design, most LLMs have a randomization factor in how they sample. Some use the concept of “temperature,” which lets them randomly choose the 2nd- or 3rd-highest-ranked next token; the higher the temperature, the more often (and the lower down the ranking) they pick a non-best next token. OpenAI described this in their papers around the GPT-2 timeframe IIRC.
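
A quick worked illustration of that last point, with made-up logits for a best and a second-best token:

    import numpy as np

    logits = np.array([2.0, 1.0])  # best vs. second-best token (illustrative values)
    for t in (0.5, 1.0, 2.0):
        p = np.exp(logits / t)
        p /= p.sum()
        print(f"T={t}: P(2nd-best) = {p[1]:.2f}")
    # T=0.5: P(2nd-best) = 0.12
    # T=1.0: P(2nd-best) = 0.27
    # T=2.0: P(2nd-best) = 0.38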



