Yes, the training doesn't encourage this. It encourages guessing, because if it ...

Yes, the training doesn't encourage this. It encourages guessing, because if it guesses the next word and it's right, the guessing is reinforced.

Whenever the model gets something right, it's the result of good guesses that were reinforced. It's all guesswork, it's just that some guesses are right.