Forward Mode Automatic Differentiation [video]

by matthbergon 2/27/2025, 6:43 AMwith 1 comments

by matthbergon 2/27/2025, 7:58 AM

(Timestamp to skip alternative methods' background info: [0])

An algorithm that yields a computationally precise (no compounding floating point errors from small deltas like with a numerical evaluation) derivative function for all differentiable functions by evaluating it once, with no need for AST parsing or complicated rule compositions (think chain rule, quotient rule, etc. [1] used in symbolic differentiation).

The trick is dual numbers [2] (like a sibling of complex numbers), which have an epsilon component (analogous to complex numbers' i) with rules that É› != 0 and É›*É› = 0. A property of dual numbers is that f'(x) = the dual only component of the result of evaluating f(x + 1É›). It's honestly harder to explain in english than it is to write the code.

The one catch is that you do need to implement dual handling for all mathematical operations your functions will need (just *, +, and / get you really far though). It's not hard though, multiplication is (a, b) => {real: (a.real*b.real), epsilon: (a.real*b.epsilon + b.real*a.epsilon)}, which naturally follows from foil-ing and the two epsilon rules. Trig is a little more complicated, yet easy enough to look up [3].

From there, differentiation is just (f, x) => (f({real: x, epsilon: 1}).epsilon). Given that practically every language supports custom operator implementations for classes (or whatever the language's equivalent terminology is), after a little setup you don't even need to change the syntax you use to write functions, you just get free differentiation handed to you.

[0]: https://youtu.be/QwFLA5TrviI?t=323

[1]: https://en.wikipedia.org/wiki/Differentiation_rules

[2]: https://en.wikipedia.org/wiki/Dual_number

[3]: https://math.stackexchange.com/questions/900541/implementing...