Stop approximating derivatives!

Derivatives are required at the core of many numerical algorithms. Unfortunately, they are usually computed inefficiently and approximately by some variant of the finite difference approach $$f'(x) \approx \frac{f(x+h) - f(x)}{h}, \quad h \text{ small}.$$ This method is inefficient because, for example, it requires $$\Omega(n)$$ evaluations of $$f : \mathbb{R}^n \to \mathbb{R}$$ to compute the gradient $$\nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x), \cdots, \frac{\partial f}{\partial x_n}(x)\right)$$. It is approximate because we have to choose some finite, small value of the step length $$h$$, balancing floating-point round-off error (which grows as $$h$$ shrinks) against the truncation error of the approximation (which grows as $$h$$ grows).
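To make both drawbacks concrete, here is a minimal sketch in plain Python (chosen only so that it depends on no package). The helper `fd_gradient`, the test function `f`, and the evaluation counter are hypothetical names introduced for illustration; they are not part of any JuliaDiff package.

```python
import math

def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient of f: R^n -> R, using n + 1 evaluations of f."""
    fx = f(x)                      # one baseline evaluation
    grad = []
    for i in range(len(x)):
        xh = list(x)
        xh[i] += h                 # perturb coordinate i only
        grad.append((f(xh) - fx) / h)
    return grad

counter = {"calls": 0}             # counts how often f is evaluated

def f(x):
    """Hypothetical test function with gradient (x1 * cos(x0), sin(x0))."""
    counter["calls"] += 1
    return math.sin(x[0]) * x[1]

approx = fd_gradient(f, [1.0, 2.0])
exact = [2.0 * math.cos(1.0), math.sin(1.0)]
grad_error = max(abs(a - e) for a, e in zip(approx, exact))

def step_error(h):
    """Error of the forward difference for sin'(1) = cos(1) at step length h."""
    return abs((math.sin(1.0 + h) - math.sin(1.0)) / h - math.cos(1.0))

print("evaluations of f:", counter["calls"])   # n + 1 evaluations for n = 2
print("gradient error:", grad_error)
for h in (1e-1, 1e-8, 1e-15):
    print(h, step_error(h))
```

The loop at the end shows the trade-off in the step length: a large step suffers from truncation error, a tiny step from round-off, and the smallest error lies somewhere in between.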

What can we do instead?

One option is to explicitly write down a function which computes the exact derivatives by using the rules that we know from calculus. However, this quickly becomes an error-prone and tedious exercise. There is another way! The field of automatic differentiation provides methods for automatically computing exact derivatives (up to floating-point error) given only the function $$f$$ itself. Some methods use far fewer evaluations of $$f$$ than would be required when using finite differences. In the best case, the exact gradient of $$f$$ can be evaluated for the cost of $$O(1)$$ evaluations of $$f$$ itself. The caveat is that $$f$$ cannot be considered a black box; instead, we require either access to the source code of $$f$$ or a way to plug in a special type of number using operator overloading.
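The "special type of number" idea can be sketched with dual numbers, which carry a value together with a derivative and propagate both through every arithmetic operation. The following is a minimal illustration in plain Python, assuming only addition, multiplication, and sine are needed. The class `Dual` and the helpers `sin` and `derivative` are hypothetical and do not reproduce the API of any Julia package.

```python
import math

class Dual:
    """Number a + b*eps with eps**2 == 0; the eps coefficient b carries the derivative."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        """Treat plain numbers as constants (derivative zero)."""
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

def sin(x):
    """Sine overloaded for dual numbers via the chain rule."""
    x = Dual.lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)

def derivative(f, x):
    """Evaluate f once on a dual number seeded with derivative 1."""
    return f(Dual(x, 1.0)).deriv

def f(x):
    # Written generically: works for floats and for Dual inputs alike.
    return x * x * x + 2 * sin(x)

d = derivative(f, 1.5)
exact = 3 * 1.5**2 + 2 * math.cos(1.5)
print(d, exact)
```

No step length appears anywhere: the result agrees with the hand-derived formula up to floating-point error, at the cost of a single evaluation of `f` on the overloaded type.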

JuliaDiff is an informal organization which aims to unify and document packages written in Julia for evaluating derivatives. The technical features of Julia, namely, multiple dispatch, source code via reflection, JIT compilation, and first-class access to expression parsing, make implementing and using techniques from automatic differentiation easier than ever before (in our biased opinion). Packages hosted under the JuliaDiff organization follow the same guidelines as those for JuliaOpt; namely, they should be actively maintained, well documented, and have a basic testing suite.

Included packages

Below we list the packages that are currently included in the JuliaDiff organization and their testing status on the latest Julia release, if available.

• DualNumbers: Implements a Dual number type which can be used for forward-mode automatic differentiation of first derivatives via operator overloading.
• ForwardDiff: A unified package for forward-mode automatic differentiation, combining both DualNumbers and vector-based gradient accumulations.
• HyperDualNumbers: Implements a Hyper number type which can be used for forward-mode automatic differentiation of first and second derivatives via operator overloading.
• ReverseDiffSource: Implements reverse-mode automatic differentiation for gradients and high-order derivatives given user-supplied expressions or generic functions. Accepts a subset of valid Julia syntax, including intermediate assignments.
• TaylorSeries: Implements truncated multivariate power series for high-order integration of ODEs and forward-mode automatic differentiation of arbitrary order derivatives via operator overloading.
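For contrast with the forward-mode packages, the following minimal sketch illustrates the reverse-mode idea: evaluate the function once while recording an expression graph, then sweep backwards through the graph to obtain every partial derivative in a single pass. It is written in plain Python using operator overloading and supports only addition and multiplication. This is not how ReverseDiffSource works (that package operates on expressions), and the names `Var` and `gradient` are hypothetical.

```python
class Var:
    """Node in the expression graph recorded during the forward evaluation."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents     # pairs of (parent node, local partial derivative)
        self.grad = 0.0            # accumulated d(output)/d(this node)

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))
    __rmul__ = __mul__

def gradient(f, xs):
    """Gradient of f: R^n -> R from one forward evaluation and one backward sweep."""
    inputs = [Var(x) for x in xs]
    out = f(inputs)

    # Post-order traversal: every node appears after all of its parents.
    order, seen = [], set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)
    visit(out)

    out.grad = 1.0
    for node in reversed(order):   # each node is processed before its parents
        for parent, local in node.parents:
            parent.grad += local * node.grad   # chain rule
    return [v.grad for v in inputs]

def f(x):
    return x[0] * x[1] + x[1] * x[2] + 3 * x[0]

g = gradient(f, [2.0, 5.0, 7.0])
print(g)   # partial derivatives are (x1 + 3, x0 + x2, x1)
```

The cost of the backward sweep is proportional to the size of the recorded graph, independent of the number of inputs, which is the sense in which a gradient can cost $$O(1)$$ evaluations of $$f$$.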

Related packages

These Julia packages also provide differentiation functionalities.

• Calculus: Provides methods for symbolic differentiation and finite-difference approximations.
• PowerSeries: Implements truncated power series type which can be used for forward-mode automatic differentiation of arbitrary order derivatives via operator overloading.
• ReverseDiffOverload: Implements reverse-mode automatic differentiation by overloading function inputs to extract scalar and vector-valued expression graphs.
• ReverseDiffSparse: Implements reverse-mode automatic differentiation for gradients and sparse Hessian matrices given closed-form expressions.
• SymEngine: Implements symbolic differentiation.
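The truncated power series approach mentioned above for arbitrary order derivatives can be sketched as follows: propagate the Taylor coefficients of the input through the function, then read off the k-th derivative as k! times the k-th coefficient. This is a minimal univariate illustration in plain Python, limited to addition and multiplication. The names `Taylor`, `lift`, and `derivatives` are hypothetical and are not the API of PowerSeries or TaylorSeries.

```python
import math

class Taylor:
    """Truncated power series c[0] + c[1]*t + ... + c[N]*t**N."""
    def __init__(self, coeffs):
        self.c = list(coeffs)

    def __add__(self, other):
        other = lift(other, len(self.c))
        return Taylor(a + b for a, b in zip(self.c, other.c))
    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other, len(self.c))
        n = len(self.c)
        # Cauchy product, dropping all terms of degree n and higher.
        return Taylor(sum(self.c[j] * other.c[k - j] for j in range(k + 1))
                      for k in range(n))
    __rmul__ = __mul__

def lift(x, n):
    """Treat a plain number as a constant series with n coefficients."""
    return x if isinstance(x, Taylor) else Taylor([x] + [0.0] * (n - 1))

def derivatives(f, x0, order):
    """Return [f(x0), f'(x0), ..., f^(order)(x0)] from one evaluation of f."""
    seed = ([x0, 1.0] + [0.0] * order)[:order + 1]   # the series x0 + t
    series = f(Taylor(seed))
    return [math.factorial(k) * series.c[k] for k in range(order + 1)]

def f(x):
    return x * x * x + 2 * x

ds = derivatives(f, 2.0, 3)
print(ds)   # value, first, second, and third derivative at x0 = 2
```

With `order=1` this reduces to the dual-number scheme; higher orders only lengthen the coefficient vector.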

How are they being used?

Packages implementing automatic differentiation techniques are already in use in the broader Julia ecosystem.

• Optim and NLsolve use DualNumbers to compute exact gradients and Jacobians of user-provided functions. Just set autodiff=true. The function must be written to take a generic input vector, e.g., f{T}(x::Vector{T}) or just f(x) instead of f(x::Vector{Float64}), so that it also accepts vectors of dual numbers.
• JuMP uses ReverseDiffSparse to compute gradients and sparse Hessians, providing them to efficient interior-point solvers.
• Lora, an MCMC engine, uses ReverseDiffSource to compute gradients of statistical models.