A pet peeve of mine is that differentiable programming has been co-opted almost entirely by deep learning and neural networks. The idea is much bigger than SGD, and in fact a neural network is typically a simple program to differentiate. Full differentiable programming means solving much harder problems around control flow (branches, loops, recursion), not just implementing forward- or reverse-mode differentiation for math operations whose gradients are well defined and well understood.
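
To make the control-flow point concrete, here is a minimal sketch of forward-mode differentiation with dual numbers in plain Python. The names `Dual`, `program`, and `derivative` are made up for illustration and don't come from any library. The arithmetic rules are the easy part; the interesting bit is that `program` has a loop whose trip count and branch choices depend on the input value, so the "formula" being differentiated changes with `x`.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value paired with its derivative with respect to the input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def program(x):
    """Hypothetical example: loop count and branch both depend on the data."""
    y = x
    while y.val < 10.0:
        if y.val < 2.0:
            y = y * 3.0   # small values get tripled
        else:
            y = y * y     # larger values get squared
    return y


def derivative(f, x):
    """Seed the input with derivative 1 and read the derivative off the output."""
    return f(Dual(x, 1.0)).dot


if __name__ == "__main__":
    print(derivative(program, 1.0))  # path taken: ((3x)^2)^2 = 81x^4
    print(derivative(program, 3.0))  # path taken: (x^2)^2 = x^4
```

Note that this only differentiates the path the program actually took for a given input. It says nothing useful at the points where the branch or the loop count flips, where this function isn't even continuous, and that's exactly the kind of problem that goes beyond having gradients for individual math ops.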