As humans (or a very advanced AI) we can learn to compute the rate of changeof a function, for example if$$ f(x) : \mathbb{R}\mapsto\mathbb{R} = x^2 $$then \(f\)'s rate of change with respect to a change in \(x\) is$$ \frac{df(x)}{dx} = 2x $$This comes from the definition of a derivative$$ f'(x) = \frac{df(x)}{dx} = \lim_{h\rightarrow 0} \frac{f(x+h)-f(x)}{h} $$i.e the slope of the line \(f(x)\) evaluated between two infinitely close points.It is possible to define a programming language capable of analytic differentiation(Mathematica and Wolfram Alpha),this is not always practical or useful, as such most applications of the derivativeresort to some approximation of the limit definition. Most simply thiscan be done by taking an \(h\), say \(h=0.01\), and plugging in the numbers.This is fine but will lead to errors.

A dual number is a "new" type of number analogous to complex numbers, but nowinstead of using \(a+bi\) where \(i=\sqrt{-1}\) we write$$ a+b\epsilon, \enspace \text{ where } \epsilon^{2}=0, \epsilon \neq 0 \enspace \text{ for } a,b\in\mathbb{R} $$This seemingly trivial academic pursuit it incredibly fruitfull. Consider \(f(x)=x^2\), notice$$f(a+b\epsilon) = a^2 + b^2\cdot(0) + 2ab\epsilon = a^2 + 2ab\epsilon$$again nothing too crazy, yet, but see that$$ f(a) = a^2, \text{ and } f'(a) = 2a $$so if we had taken \(a+\epsilon\) inplace of \(a+b\epsilon\) then we would get$$f(a+\epsilon) = f(a)+f'(a)\epsilon$$This is no accident, in fact this result applies to any \(f\) where \(f'\) exists!$$f(a+b\epsilon) = \sum_{n=0}^{\infty}\frac{f^{(n)}(a)b^n\epsilon^n}{n!} = f(a) + bf'(a)\epsilon + \epsilon^{2}\sum_{n=1}^{\infty}\frac{f^{(n)}(a)b^n\epsilon^n}{n!}$$So in general evaluating a function of a dual number gives you the value of the functionas it's real part and the

```
struct Dual{t1<:Number,t2<:Number} a::t1
b::t2
end
```

Of course the real work is done in defining operators on our objects, inthis case \(+,-,/,*\) for addition, subtraction, division, and multiplication.```
import Base.+, Base.*, Base.^, Base.-, Base./
# addition and subtraction are easy
+(x::Dual,y::Dual) = Dual(x.a+y.a,x.b+y.b)
-(x::Dual,y::Dual) = Dual(x.a-y.a,x.b-y.b)
# multiplication (remember that binomial theorem?)
*(x::Dual,y::Dual) = Dual(x.a*y.a,x.a*y.b+x.b*y.a)
*(x::Number,y::Dual) = Dual(x*y.a,x*y.b)
*(y::Dual,x::Number) = *(x::Number,y::Dual)
# division a bit more of a pain
/(x::Dual,y::Dual) = Dual(x.a/y.a,(x.b*y.a-x.a*y.b)/(y.a^2))
/(x::Number,y::Dual) = /(Dual(x,0.),y)
/(y::Dual,x::Number) = /(y,Dual(x,0.))
# positive integer powers
function ^(x::Dual,n::Int)
y = x
for i in 1:(n-1)
y = y*x
end
return y
end
```

At this point the only curve-ball is division, after a bit of algebrayou'd find the expression$$ \frac{a+b\epsilon}{c+d\epsilon} = \frac{a}{c} + \frac{bc-ad}{c^2}\epsilon $$which is what we have in the code.With these definitiions we can make use of AD for functions using integerpowers multiplication division and summation, e.g for powers```
julia> [Dual(2^b,1)*Dual(2^b,1) for b in 1:8]8-element Array{Dual{Int64},1}:
Dual{Int64}(4, 4)
Dual{Int64}(16, 8)
Dual{Int64}(64, 16)
Dual{Int64}(256, 32)
Dual{Int64}(1024, 64)
Dual{Int64}(4096, 128)
Dual{Int64}(16384, 256)
Dual{Int64}(65536, 512)
```

and for \(f(x)=\frac{1}{x}, f'(x)=-\frac{1}{x^2}\)```
julia> f = (x) -> 1.0/xjulia> g = (x) -> -(1.0/x^2)
julia> [(1.0/Dual(2^b,1),(f(2^b),g(2^b))) for b in 1:8]
8-element Array{Tuple{Dual{Float64,Float64},Tuple{Float64,Float64}},1}:
(Dual{Float64,Float64}(0.5, -0.25), (0.5, -0.25))
(Dual{Float64,Float64}(0.25, -0.0625), (0.25, -0.0625))
(Dual{Float64,Float64}(0.125, -0.015625), (0.125, -0.015625))
(Dual{Float64,Float64}(0.0625, -0.00390625), (0.0625, -0.00390625))
(Dual{Float64,Float64}(0.03125, -0.0009765625), (0.03125, -0.0009765625))
(Dual{Float64,Float64}(0.015625, -0.000244140625), (0.015625, -0.000244140625))
(Dual{Float64,Float64}(0.0078125, -6.103515625e-5), (0.0078125, -6.103515625e-5))
(Dual{Float64,Float64}(0.00390625, -1.52587890625e-5), (0.00390625, -1.52587890625e-5))
```

Great. But what about other functions like \(\cos\), \(\sin\), \(\exp\), \(\log\)and real powers (roots etc)? Generally for any function we would makeuse of it's Taylor seris (or a faster converging approximation). Let'slook at \(\exp\). The Taylor seris simplifies heavily, i.e$$e^{a+b\epsilon} = \sum_{n=0}^{\infty}\frac{(a+b\epsilon)^{n}}{n!}=\sum_{n=0}^{\infty}\frac{1}{n!}\sum_{k=0}^{n}\binom{n}{k}a^{n-k}(b\epsilon)^{k} \\= \sum_{n=0}^{\infty}\frac{1}{n!}\bigl(a^{n} + na^{n-1}b\epsilon\bigr) = e^{a}(1+b\epsilon)$$So \(\exp(x)\) can be found exactly with a few muls!This also means exponentiation to any real power is trivial if we know log...$$(a+b\epsilon)^{x} = e^{x\log(a+b\epsilon)}$$$$ \text{Let } \log(a+b\epsilon) = c+d\epsilon$$$$ (a+b\epsilon)^x = e^{xc}e^{xd\epsilon} = e^{xc}(1+xd\epsilon)$$Thankfully we can (admittedly approximately) find the natural log just from integer powers, \(*\), and \(/\) !$$\log(x) = 2\sum_{k=0}^{\infty}\frac{1}{2k+1}\biggl(\frac{x-1}{x+1}\biggr)^{2k+1}$$This series in particular works nicely for \(x\) about \(1\), whichwe can handle by using \(\log(ab) = \log(a)+\log(b)\) to reduce (or increase) the logandto be about \(1\)```
function Log(z::Dual,k=10) ln2 = 0.6931471805599453094172321214581765680755001343602552541206800094933936219696947156058633269964186875420014810205706857336855202
b = 0.
m = z.a
# use log(ab) = log(a) + log(b) to reduce the magnitute of the logand
if m > 1
# find smallest power of 2 larger
while 2. ^b < m
b += 1
end
elseif m < 1
# find smallest power of -2 smaller
while m < 2. ^b
b -= 1
end
end
# reduce the logand
z = z / 2. ^b
L = Dual(0.0,0.0)
# compute the series with the reduced logand for accuracy
for i in 0:k
L += (1/(2*i+1))*((z-Dual(1,0))/(z+Dual(1,0)))^(2*i+1)
end
# factor in the logand reduction
return 2*L + Dual(b*ln2,0.0)
end
function ^(x::Dual,n::AbstractFloat)
e = 2.7182818284590452353602874713526624977572470936999595749669676277240766303535475945713821785251664274274663919320030599218174135
y = Log(x,10)
c = e^(n*y.a)
return Dual(c,c*(n*y.b))
end
```

This code gives us a quick implementation of dual numbers that we can get some useout of, but it's far from complete. A real implementation in Julia can be foundas a subpackage of ForwardDiff.jl,and a great use case for industry is the very elegant machine learning solution Flux.jl