
If you continue reading that Wikipedia article, you'll reach this point:

> A second-order Markov chain can be introduced by considering the current state and also the previous state, as indicated in the second table.

i.e., a higher-order Markov chain can depend on several of the previous states.

So if a transformer model accepts up to 20k tokens as input, it can certainly be seen as a 20,000th-order Markov chain: the distribution of the next token depends only on the (at most) 20,000 tokens in its context window. Whether it is useful to view it that way can be debated, but not the fact that it can be viewed that way, since it fits the definition of a Markov chain.
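To make the "order k" idea concrete, here is a minimal sketch in Python. It is a toy count-based model, not a transformer, and the names `fit_order_k` and `next_distribution` are made up for illustration. The point it shows is just the defining property: the next-token distribution is a function of the last k tokens only, so two histories that agree on their last k tokens give the same prediction.

```python
from collections import Counter, defaultdict

def fit_order_k(tokens, k):
    """Count next-token frequencies conditioned on the previous k tokens."""
    counts = defaultdict(Counter)
    for i in range(k, len(tokens)):
        state = tuple(tokens[i - k:i])  # the "state" is the last k tokens
        counts[state][tokens[i]] += 1
    return counts

def next_distribution(counts, history, k):
    """Next-token probabilities; only the last k tokens of history are used."""
    state = tuple(history[-k:])
    total = sum(counts[state].values())
    return {tok: n / total for tok, n in counts[state].items()}

# Toy data (an assumption for the example, not from any real corpus).
tokens = "a b a c a b a c a b".split()
k = 2
counts = fit_order_k(tokens, k)

# Two histories of different length that share the same last k tokens.
short_history = ["b", "a"]
long_history = ["c", "c", "c", "b", "a"]

# Anything older than the last k tokens has no effect on the prediction.
assert next_distribution(counts, short_history, k) == \
       next_distribution(counts, long_history, k)
```

Replace k = 2 with k = 20000 and the lookup table with a neural network, and the same structure describes a model with a fixed 20k-token context window, at least under the assumption that nothing outside the window influences the output.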



