0%

Dual Number and Auto Derivative of Special Orthogonal Group

Background

Finding the closed-form Jacobian matrix of a residual function is one of the most time consuming works in solving a nonlinear optimization problem. A work around is to use numerical Jacobian. i.e. manually give a small increment in every variable and find out the increment in residual. However the selection of the increment depends on the data magnitude. An inappropriate increment can lead to fluctuation before convergence.

In google ceres-solver, an auto derivative method is introduced to liberate developers from complicated Jacobian debugging. This blog introduces the principle behind the auto derivative and explains a Perspective-N-Point problem demo based on auto derivative and nonlinear optimization.

Dual Number Algebra

The basic idea of auto derivative is to extend the input of a single variable function. Consider function \(y = f(x)\), the input \(x\) is no longer a variable, but an out put of another function with first order slop \( b\varepsilon \). Where \(\varepsilon\) is the basis of the increment, real number \(b\) measures the magnitude of the increment. Therefore, variable \(x\) is expressed as \(x = a + b \varepsilon\).

Define \(\varepsilon ^2 = 0\) to drop infinitesimals with order higher than one.

Addition

\[ (a + b \varepsilon) + (c + d \varepsilon) = (a + c) + (b + d) \varepsilon \]

Identity of Addition:

\[ (a + b \varepsilon) + (0 + 0 \varepsilon) = a + b \varepsilon \]

For the C++ implementation of dual number addition, see mxm/linalg_dual_number.h:93.

Multiplication

\[ (a + b \varepsilon)(c + d \varepsilon) = ac + (ad + bc) \varepsilon \]

Conjugate

Denote \(x = a + b \varepsilon\), its conjugate \(\bar x\) is denote as:

\[ \bar{x} = a - b \varepsilon \]

Then, \(\bar{x} x = a^2 - b^2 \varepsilon^2 = a^2\).

Identity of Multiplication:

\[ (a + b \varepsilon)(1 + 0 \varepsilon) = a + b \varepsilon \]

Inversion of Multiplication:

\[ \begin{align} \frac{1}{a + b \varepsilon} &= \frac{a - b \varepsilon}{(a + b \varepsilon) (a - b \varepsilon)}\\ &= \frac{a - b \varepsilon}{a^2} \\ \end{align} \]

Where \(a \neq 0\). Dual number division uses conjugate property to eliminate the denominator.

For the C++ implementation of dual number multiplication, see mxm/linalg_dual_number.h:22.

Derivative

For all functions that can be expanded as Taylor Series at point \(x_0\) :

\[ \begin{align} f(x) &= f(x_0) + \frac{f^ {\prime} (x_0)}{1!}(x-x_0) + \frac{f^ {\prime \prime} (x_0)}{2!} (x-x_0)^2 + \frac{f^ {\prime \prime \prime} (x_0)}{3!} (x-x_0)^3 + \cdots \\ &= \sum_{\infty }^{n=0} \frac{f^{(n)}(x_0)}{n!} (x-x_0)^n \\ \end{align} \]

Set \(x = a + b \varepsilon\) and expand the series at \(x_0 = a\),

\[ \begin{align} f(a + b \varepsilon) &= f(a) + \frac{f^ {\prime} (a)}{1!} (b\varepsilon) + \frac{f^ {\prime \prime} (a)}{2!} (b\varepsilon) ^2 + \frac{f^ {\prime \prime \prime} (a)}{3!} (b\varepsilon) ^3 + \cdots \\ &= f(a) + f^ {\prime} (a) (b\varepsilon) \tag{2-5}\\ \end{align} \]

i.e.

\[ f^{\prime }(a) \varepsilon = \frac{ f(a + b \varepsilon) - f(a) }{b} \tag{2-6} \]

Dual Number Elementary Functions

In order to use dual number in C++ project, simply overloading operators like + - * / is not enough. C++ built in functions such as sin, cos, exp, log don't handle dual number input. The dual number class have to support all these functions.

Luckily, most elementary functions of dual numbers can be converted to the combination of C++ built in functions. For example:

By applying formula: \(\sin(u + v) = \sin(u) \cos(v) + \cos(u) \sin(v)\) and \(\lim_{x \to 0} \frac{\sin(x)}{x} = 1 \)

\[ \begin{align} \sin(a + b \varepsilon) &= \sin(a) \cos(b \varepsilon) + \cos(a) \sin(b \varepsilon) \\ &= \sin(a) \cdot 1 + \cos(a) b \varepsilon \\ &= \sin(a) + b \cos(a) \varepsilon \end{align} \]

Or by applying formula (2-5): \(f(a + b \varepsilon) = f(a) + f'(a) (b \varepsilon)\)

\[ \sin(a + b \varepsilon) = \sin(a) + b \cos(a) \varepsilon \tag{3-1} \]

Thus the C++ implementation of the sin function for dual number is:

    template <typename DType> DualNumber <DType>
    sin(const DualNumber < DType>& val)
    {
        return {std::sin(val(0)), std::cos(val(0)) * val(1)};
    }
    

Mostly Used Elementary Functions

\[ \begin{align} \sin(a + b \varepsilon) &= \sin(a) + b \cos(a) \varepsilon \\ \arcsin(a + b \varepsilon) &= \arcsin(a) + \frac{b \varepsilon}{\sqrt{1-a^2}} \\ \cos(a + b \varepsilon) &= \cos(a) - b \sin(a) \varepsilon \\ \arccos(a + b \varepsilon) &= \arccos(a) - \frac{b \varepsilon}{\sqrt{1-a^2}} \\ \exp(a + b \varepsilon) &= \exp(a) + b \exp(a) \varepsilon \\ \ln(a + b \varepsilon) &= \ln(a) + \frac{a}{b} \varepsilon \\ (a + b \varepsilon)^ \alpha &= a^ \alpha + (\alpha b) a^{\alpha - 1} \varepsilon \\ \end{align} \]

For the C++ implementation of the elementary functions above, see mxm/linalg_dual_number.h:138.

Derivative of 3D Special Orthogonal Group

Consider a rotation axis \(\phi \in \mathbb{R}^{3} \) and \(\| \phi \| = 1 \), a rotation along axis \(\phi\) with angle \(\theta\) can be expressed as \(f(\theta) = \exp (\theta \phi^ \wedge) \). The \((\cdot )^ \wedge \) operation converts a 3D vector into a skew-symmetric matrix.

Meanwhile, \(f'(\theta) = \phi^ \wedge \exp(\theta \phi^ \wedge)\) and \(f^{(n)}(\theta) = (\phi^ \wedge)^n \exp(\theta \phi^ \wedge)\)

The Taylor Series of \(f(\theta)\) at point \(\theta_0\) is:

\[ \begin{align} f(\theta) &= f(\theta_0) + \frac{f'(\theta_0)}{1!}(\theta - \theta_0) + \frac{f''(\theta_0)}{2!}(\theta - \theta_0)^2 + \cdots \\ &= ( I + \phi^ \wedge (\theta - \theta_0) + \frac{(\phi^ \wedge)^2}{2!}(\theta - \theta_0)^2 + \cdots ) f(\theta_0)\\ &= \left( \sum_{n=0}^{\infty } \frac{(\phi^ \wedge)^n}{n!} (\theta - \theta_0)^n \right) f(\theta_0) \end{align} \]

Let \(\theta = a + b \varepsilon\) and \(\theta_0 = a\), then

\[ f(a + b \varepsilon) = (I + \phi^ \wedge b \varepsilon) f(a) \] \[ \frac{f(a + b \varepsilon) - f(a)}{b} = \phi^ \wedge f(a) \varepsilon \]

The DoF of a 3D rotation is 3, therefore three orthogonal rotation vectors are required to find all derivative of a rotation function. Normally the standard orthogonal basis \([0,0,1]^\top, [0,1,0]^\top, [1,0,0]^\top\) are used as \(\phi\) .

Rodrigues Formula

Rodrigues formula is essential the exponential map from \(\mathfrak{so}(3)\) to \(SO(3)\).

Denote \(\varphi = \theta \phi\)

\[ \exp(\varphi) = I + \frac{\sin(\theta)}{\theta} \varphi^ \wedge + \frac{1-\cos (\theta)}{\theta^2} \varphi^ \wedge \varphi^ \wedge \]

The numerical accuracy decrease when \(\theta \rightarrow 0\). In normal C++ implementation, an if expression is used to eliminate the singularity:

\[ \| \varphi \| < \epsilon \Rightarrow \exp(\varphi) = I \tag{4-3} \]

Where \(\epsilon\) is the machine accuracy. However the expression is incorrect when dealing with dual number input.

Denote

\[ \varphi = \left[ \begin{matrix} a_1 + b_1 \varepsilon \\ a_2 + b_2 \varepsilon \\ a_3 + b_3 \varepsilon \\ \end{matrix} \right] \]

Denote \(a = [a_1, a_2, a_3]^\top\). \(\theta = 0 \Rightarrow \|a\| = 0 \Leftrightarrow \varphi ^ \wedge \varphi ^ \wedge = 0 \) but \( \not \Rightarrow \varphi ^ \wedge = 0\). Therefore, the expression (4-3) should be corrected to

\[ \| \varphi \| < \epsilon \Rightarrow \exp(\varphi) = I + \varphi^ \wedge \tag{4-4} \]

Here is the C++ implementation of the Rodrigues formula that supports dual number calculation.

A Demo with Rotation Auto Derivative

Perspective N Point(PnP) problem tends to find the camera pose by minimizing the reprojection error between 2D points. It is a typical nonlinear optimization problem with the state vector constrained on a 6 DoF manifold. The demo use dual number auto derivative to find the Jacobian matrix and the method converged correctly.

See mxm/cv_3d.h:317 for Jacobian matrix calculation and tests/test_cv_basic.cpp:537 for PnP problem verification pipeline.

Limitations

Consider real number expression \(c = \frac{ab}{b}\). It is easy to find \(c = a\) when \(b \neq 0\) but hard to define the value of \(c\) when \(b = 0\). Similarly for dual number: \(x \neq \sqrt[]{x^2}\) when \(x\) is a purely nonreal dual number, i.e. \(x = b \varepsilon\).

References

  1. Michael Penn, "The strange cousin of the complex numbers - the dual numbers." YouTube, 2022
  2. Wikipedia, "Dual number."