> The parameters θ of the deep neural network in AlphaZero are trained by self-p...

> The parameters θ of the deep neural network in AlphaZero are trained by self-play reinforcement learning, starting from randomly initialised parameters θ. Games are played by selecting moves for both players by MCTS, at ∼ πππt. At the end of the game, the terminal position sT is scored according to the rules of the game to compute the game outcome z: −1 for a loss, 0 for a draw, and +1 for a win.

In this case sounds like better algorithm + a lot of self-generated data (but no prior knowledge other than rules).