As the author points out in the conclusion, the state space blows up very quickly as the grid becomes larger.
There is a large class of algorithms for finding approximately optimal solutions to MDPs[1] that are model-free, meaning you don't need to enumerate all of the state-to-state transition probabilities to learn a good policy.
If you google "2048 reinforcement learning"[0], you'll find plenty of implementations of such algorithms.
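For a flavor of how that works, here's a minimal sketch of tabular Q-learning, one such model-free method, run against a toy chain environment. The environment and all names here are illustrative, not from the article; a real 2048 board would expose the same reset/step interface but has far too many states for a plain lookup table:

    import random
    from collections import defaultdict

    class ChainEnv:
        """Toy 5-state chain: actions 0 (left) / 1 (right), reward 1 at the far end."""
        N = 5
        def reset(self):
            self.pos = 0
            return self.pos
        def step(self, action):
            self.pos = max(0, min(self.N - 1, self.pos + (1 if action else -1)))
            done = self.pos == self.N - 1
            return self.pos, (1.0 if done else 0.0), done

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
        Q = defaultdict(lambda: [0.0, 0.0])   # state -> estimated value of each action
        for _ in range(episodes):
            s, done = env.reset(), False
            while not done:
                # Epsilon-greedy: explore sometimes, otherwise act greedily
                # (break ties randomly so the untrained agent still moves).
                if random.random() < eps or Q[s][0] == Q[s][1]:
                    a = random.randrange(2)
                else:
                    a = 0 if Q[s][0] > Q[s][1] else 1
                s2, r, done = env.step(a)
                # The model-free part: we never touch transition probabilities,
                # only the sampled transition (s, a, r, s2).
                Q[s][a] += alpha * (r + gamma * max(Q[s2]) * (not done) - Q[s][a])
                s = s2
        return Q

    Q = q_learning(ChainEnv())
    print({s: ("left", "right")[Q[s][1] > Q[s][0]] for s in sorted(Q)})  # greedy policy

For the actual 2048 board the implementations you'll find typically swap the table for a function approximator (n-tuple networks are popular for 2048), but the update rule is the same.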
What a coincidence. I was just reading a blog post this morning about concrete vs. abstract interpretation of chess, and how to deal with its massive state space through abstract representation: http://www.msreverseengineering.com/blog/2018/2/26/concrete-...
[0] https://www.google.com/search?q=2048+reinforcement+learning
[1] https://en.wikipedia.org/wiki/Markov_decision_process#Algori...