> It's also interesting how the result heavily depends on the path taken during the gradient descent rather than merely the end result. This is somewhat counterintuitive when for the equivalent gradient descent only the end result is seemingly important.

I don’t understand. What do you mean here?

I only (quickly) skimmed the article but it says that the limit is equal to a (not the) zero of your gradient.

The result should be the same for every trajectory leading to the same zero. Put another way, it doesn’t depend on the path if it leads to the same zero.

I would say here that not all zeros are equal though. :)
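
To make that concrete, here is a toy sketch (my own example, not taken from the article): plain gradient descent on f(x) = (x^2 - 1)^2, whose gradient has zeros at -1, 0, and 1. The function, the step size, and the starting points are all assumptions picked for illustration. Two different starting points take different paths to the same zero, and a third ends at a different zero.

```python
def grad(x):
    # Gradient of the toy objective f(x) = (x**2 - 1)**2 (hypothetical example).
    return 4.0 * x * (x * x - 1.0)


def descend(x0, lr=0.05, steps=500):
    # Plain fixed-step gradient descent; returns the final iterate only.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


if __name__ == "__main__":
    # 0.5 and 1.5 follow different trajectories but approach the same zero (x = 1).
    # -0.5 approaches a different zero (x = -1).
    for start in (0.5, 1.5, -0.5):
        print(f"start={start:+.1f} -> limit={descend(start):+.6f}")
```

The limit here is determined by which zero the trajectory reaches, not by how it got there; which zero you reach does depend on where you start.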