
You need to think of a derivative as just a function that takes a function and a point, and returns the linear function that best approximates the original function near that point.
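To make "best approximates" concrete, here is one standard way to write it down. The notation is my own choice, not something from the comment above: D_a f stands for the derivative of f at the point a, viewed as a linear map, and h is a small displacement.

```latex
% D_a f is the linear map that makes the error vanish faster than \|h\|:
f(a + h) = f(a) + (D_a f)(h) + o(\|h\|) \quad \text{as } h \to 0,
\qquad \text{i.e.} \qquad
\lim_{h \to 0} \frac{\| f(a + h) - f(a) - (D_a f)(h) \|}{\|h\|} = 0 .
```

In one variable, (D_a f)(h) is just f'(a) times h; in several variables it is the Jacobian matrix at a applied to h.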

So basically the chain rule states that if you have two functions F and G composed together, say F after G, and you want the derivative (the linearized approximation) of the composite at a point x, you simply compute the linearized function for G at x and for F at G(x), and then compose the two linearized versions in the same order.
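Here is a minimal numerical sketch of that statement, in plain Python. Everything in it is made up for illustration: the functions F and G, the point a, and the helpers jacobian and matmul are hypothetical, and the linearizations are estimated with central differences rather than computed exactly. It checks that the Jacobian of F after G at a matches the Jacobian of F at G(a) times the Jacobian of G at a.

```python
import math

def jacobian(f, x, h=1e-6):
    """Estimate the Jacobian of f at x by central differences (list of rows)."""
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def matmul(A, B):
    """Multiply two matrices; this is composition of the two linear maps."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Hypothetical example functions, chosen only for illustration.
def G(v):  # R^2 -> R^2
    x, y = v
    return [x * y, x + y]

def F(v):  # R^2 -> R^1
    u, w = v
    return [math.sin(u) + w * w]

def F_after_G(v):
    return F(G(v))

a = [0.5, -1.2]

# Linearize the composite directly ...
lhs = jacobian(F_after_G, a)
# ... versus linearize G at a and F at G(a), then compose the linear maps.
rhs = matmul(jacobian(F, G(a)), jacobian(G, a))

max_diff = max(abs(l - r)
               for lhs_row, rhs_row in zip(lhs, rhs)
               for l, r in zip(lhs_row, rhs_row))
print(lhs, rhs, max_diff)
```

Note that the linearization of F is taken at G(a), not at a. The sum-of-products formula from a first multivariable course is what you get when you write out the entries of that matrix product.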

The Wikipedia article on the chain rule eventually arrives at this simple and elegant formula: https://wikimedia.org/api/rest_v1/media/math/render/svg/8b0f...

But the unfortunate reality is that too many textbooks formulate the chain rule in such a complicated manner that its simplicity and elegance get obscured.




Thank you for the explanations. I agree with you that this more abstract view of the chain rule (and the derivative itself) is superior to the sum-of-products formula one usually sees in a first course in multivariable calculus, but I feel most students have to learn the complicated, technical version first before they can see the beauty of the more abstract one.



