You've just described the dual numbers, which provide a way of implementing forward-mode autodiff.
> If the derivative is abnormal in a way such that the normal
> truncated calculative formulas would conceal it, then AD
> would fail to give an accurate result as well.
What I'm seeing directly contradicts that. I've tested the example above (the one I called f(x,a) with a=1e-36, trying to find its derivative at x=0), and it gave the right answer with dual numbers, but not with finite differencing. So what you're saying isn't true. I'll post a minimal implementation here later.
[edit]
I've used Sympy out of laziness, which is inelegant. I don't know how clear this will be.
First of all, I have to change the number 1 in the definition of f(x,a) into a matrix. We need this because we'll be representing dual numbers as matrices, and you can't add the scalar 1 to a matrix:
from sympy import eye, Matrix, exp, im, I

one = eye(2)   # the constant 1, promoted to a 2x2 identity matrix
def f(x, a=1e-36):
    return one + a * sin(x / a)   # uses the matrix sin defined below
Sympy doesn't have a matrix version of `sin`, so we'll need to provide one (via the matrix exponential, since sin(M) = Im(exp(iM)) for real M):
def sin(M):
    # matrix sine via the matrix exponential: sin(M) = Im(exp(i*M)); .n() evaluates it numerically
    return im(exp(I*M).n())
At the dual number ε, we have f(ε) = f(0) + ε f'(0). The dual number ε will be represented as the matrix
⎡0  1⎤
⎢    ⎥
⎣0  0⎦
As code:
epsilon = Matrix([[0,1],[0,0]])
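As a quick sanity check (not in the original post), this matrix really does satisfy the defining dual-number property ε² = 0, which is why every Taylor term of f beyond f(0) + ε f'(0) vanishes:

from sympy import zeros

assert epsilon**2 == zeros(2, 2)   # ε² = 0, so f(0 + ε) = f(0) + ε f'(0) exactly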
Evaluating f on the above matrix gives:
In: f(epsilon)
Out:
⎡1  1.0⎤
⎢      ⎥
⎣0   1 ⎦
Which tells us that f(0) ≈ 1 and f'(0) ≈ 1. Both are correct.
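If we use finite differencing instead, we get the following (a minimal sketch of the comparison; the scalar helper f_float and the step size h are my choices for illustration, not the original code):

import math

def f_float(x, a=1e-36):
    # plain floating-point version of f
    return 1.0 + a * math.sin(x / a)

h = 1e-8                                # a typical forward-difference step
print((f_float(h) - f_float(0.0)) / h)  # prints 0.0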
So finite differencing claims, incorrectly, that the derivative at 0 is 0. No finite differencing scheme can fix this, because floating point makes the computed f constant.
Autodiff using the dual numbers has pulled off the seemingly impossible.
[edit: An early typo caused the code to produce an incorrect result. Now fixed, and my claim holds.]
> Autodiff using the dual numbers has pulled off the seemingly impossible.
It's not seemingly impossible if you understand that the chain rule for f' is simply being executed at the same time as the calculation of f, because the derivatives of the basic operations are already defined.
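For instance, here is a minimal forward-mode sketch of that idea in plain Python (the Dual class and the dsin helper are my own illustration, not any particular library), reproducing the f'(0) result from above:

import math

class Dual:
    """A value a + b*eps, where eps**2 == 0."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b          # value part, derivative part

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # the product rule, baked into the multiplication operator
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)
    __rmul__ = __mul__

def dsin(u):
    # the sin primitive with its derivative rule attached: (sin u)' = cos(u) * u'
    return Dual(math.sin(u.a), math.cos(u.a) * u.b)

# f(x) = 1 + a*sin(x/a), evaluated at x = 0 + eps
a = 1e-36
x = Dual(0.0, 1.0)                  # seed the derivative part of the input with 1
y = 1.0 + a * dsin((1.0 / a) * x)   # dividing by a via multiplication by 1/a
print(y.a, y.b)                     # approximately 1.0 and 1.0: f(0) and f'(0)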
However, like I said, if you have a calculation method that hides derivatives in terms that have been truncated, then this will not save you.
Summing (-1)**i * x**(2*i + 1) / factorial(2*i + 1) over i is not one of those methods -- and sin is rather regular in regard to its Taylor series. So of course it works out in this case, with a normal power-series calculation method.
Try a different series or calculation method, and dual numbers will get you a wildly different result. Understand, dual numbers only work well when you use a method of calculation that front-loads the terms that have the most bearing on the derivative. Otherwise the missing terms and the truncation cause severe inaccuracy.
However, stenciling might actually perform better in these scenarios.
> Try a different series or calculation method, and dual numbers will get you a wildly different result
No. The example works because:
While the exact value of f(x,a) isn't 1, in any inexact representation of real numbers (floating point or fixed point), the value of "a" can be chosen so that 1 is the closest representable value of f(x,a) for every x, since |a*sin(x/a)| is at most a.
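A quick check of that rounding claim in double precision (my own snippet; the particular x is arbitrary):

import math, sys

a = 1e-36
print(sys.float_info.epsilon)                  # the gap just above 1.0 is about 2.2e-16
print(1.0 + a == 1.0)                          # True: a is far below that gap
print(1.0 + a * math.sin(12.345 / a) == 1.0)   # True for an arbitrary x as well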
Trying to compute f(x,a) differently isn't going to change that, so stenciling methods are never going to work here. But autodiff will always work. This means I win your challenge.
Your other claims are probably gibberish. You need to provide an example.
What don't you understand -- the method of calculating the function is an approximation, so the AD derivative depends on it: the AD derivative is the derivative of the approximation, not of the actual function. Whereas an approximation of the actual derivative is what we are truly after.
> The key takeaway here is that the map is not the territory. Most nontrivial functions on computers are implemented as some function (the map) that approximates the mathematical ideal (the territory). Automatic differentiation gives back a completely accurate derivative of the function doing the approximation (the map). Furthermore, the accurate derivative of an approximation to the ideal (e.g. d_my_sin) is less accurate than an approximation to the ideal derivative of the ideal (e.g. my_cos). There is no truncation error in the work the AD did; but there is a truncation error in the sense that we are now using a more truncated approximation than we would write ourselves.
AD is great but if you have a calculation method ill-suited for AD, then you'll get shite results. Why is this surprising?
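To make the my_sin / my_cos point concrete, here is a sketch in the same matrix-dual-number style as above (the three-term truncation and the evaluation point x = 2 are my own arbitrary choices, not anything from the thread):

from sympy import Matrix, eye, cos

epsilon = Matrix([[0, 1], [0, 0]])
X = 2*eye(2) + epsilon   # the dual number 2 + eps in the matrix representation

def my_sin(M):
    # a deliberately truncated Taylor series for sin: the map, not the territory
    return M - M**3/6 + M**5/120

S = my_sin(X)
print(S[0, 1])           # -1/3: the exact derivative of the truncated series at x = 2
print(cos(2).evalf())    # about -0.416: the derivative of the ideal sin at x = 2

The dual numbers did their job perfectly here; the inaccuracy comes entirely from choosing the truncated series as the calculation method.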
And yeah, stenciling is mostly for PDEs and other state spaces that we don't have a closed form for. It's generally not used for an analytic function. But you can use it for an analytic function if you tailor the stencil to the function. In fact, you'll just get back a truncated Taylor polynomial if you provide a perfect function-specific stencil.
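For reference, a stencil here is just a fixed pattern of sample points and weights. A sketch of the standard five-point central-difference stencil for a first derivative (the helper name, the step size, and the sin-at-2 test are my own choices):

import math

def stencil_derivative(f, x, h=1e-2):
    # five-point central-difference stencil: weights (1, -8, 0, 8, -1) / (12*h)
    return (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h)) / (12*h)

print(stencil_derivative(math.sin, 2.0))   # about -0.41615
print(math.cos(2.0))                       # about -0.41615 as well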