Sensor Fusion: Part 4

In this post, we’ll add the math and provide implementation for adding image based measurements. Let’s recap the notation and geometry first introduced in part 1 of this series
The state vector in equation 16 of 1 consists of 21 elements.

  1. \delta{P_{I}^{G}} : Error in the position of the IMU in the global reference frame
  2. \delta{V_{I}^{G}} : Error in the velocity of the IMU in the global reference frame
  3. {\delta\theta_{I}^{G}} : Error in the orientation between the IMU and the global reference frame
  4. \delta{P_{C}^{I}} : Error in the position of the camera with respect to the IMU
  5. {\delta\theta_{C}^{I}} : Error in the orientation between the camera and the IMU
  6. \delta{b_{g}} : Error in gyro bias
  7. \delta{b_{a}} : Error in accel bias

This is a 21 element state vector with the camera-IMU offset as one of its elements. We’ll consider a simpler problem first where the IMU is aligned with the camera. Thus, our state vector consists of the errors in the position, velocity and orientation of the IMU (or Camera) wrt the global frame and the gyro and accel bias errors. The imaged feature points provide measurements that will be used to correct our estimates of position, velocity, orientation, gyro and accel bias.

In the previous posts, we already derived the state propagation equations for gyro bias and orientation errors. Let’s now derive the propagation equations for the accel bias, position and velocity. As before, for a variable x, x_m denotes the measured quantity, \hat{x} denotes the estimated quantity and \delta{x} denotes the error. From the laws of motion,

\dot{P_{I}^{G}} = V_{I}^{G}
and therefore,

(1)   \begin{equation*}\delta{\dot{P_{I}^{G}}} = \delta{{V_{I}^{G}}} \end{equation*}

I’m not using bold for P_{I}^{G} as it is obvious from the notation that these are vector quantities.

\delta{V_{I}^{G}} = \hat{V_{I}^{G}} - V_{I}^{G}
Taking the derivative,
\delta{\dot{V_{I}^{G}}} = \dot{\hat{V_{I}^{G}}} - \dot{V_{I}^{G}}
\delta{\dot{V_{I}^{G}}} = \hat{C}^{'G}_{I}\boldsymbol{\hat{a}} - C_{I}^{'G}\boldsymbol{a}

Note that since there are now multiple DCMs, I’m explicitly stating the coordinate systems a given DCM acts on. For example, C_I^G transforms a vector from the global to the IMU-Camera coordinate system. Also, I’m using a prime to denote transpose, so C^{'} = C^T.

From post 2 of this series,
\delta{\dot{V_{I}^{G}}} = \hat{C}^{'G}_{I}(\boldsymbol{a_m} - \boldsymbol{\hat{a_b}}) - (\hat{C}^{'G}_{I} + \hat{C}^{'G}_{I}[\delta\theta]^{'}_{\times})\boldsymbol{a}

Now, since \boldsymbol{a_m} = \boldsymbol{a} + \boldsymbol{a_b} + \boldsymbol{n_r}, \delta{\boldsymbol{a_b}} = \boldsymbol{\hat{a_b}} - \boldsymbol{a_b} and [\delta\theta]^{'}_{\times} = -[\delta\theta]_{\times},

\delta{\dot{V_{I}^{G}}} = -\hat{C}^{'G}_{I}(\delta{\boldsymbol{a_b}}) + \hat{C}^{'G}_{I}[\delta\theta]_{\times}\boldsymbol{a}

\delta{\dot{V_{I}^{G}}} = -\hat{C}^{'G}_{I}(\delta{\boldsymbol{a_b}}) + \hat{C}^{'G}_{I}[\delta\theta]_{\times}\boldsymbol{a}

From the first post, we know that [a]_{\times}\boldsymbol{b} = -[b]_{\times}\boldsymbol{a}. Thus,

(2)   \begin{equation*}\delta{\dot{V_{I}^{G}} = -\hat{C}^{'G}_{I}\delta{\boldsymbol{a_b}} - \hat{C}^{'G}_{I}[a]_{\times}\boldsymbol{\delta\theta}\end{equation*}

Similar to the gyro bias, the accel bias changes very slowly and therefore the error in the accel bias can be assumed to be constant over the time step.

(3)   \begin{equation*} \dot{\delta{\boldsymbol{a_b}}} = 0\end{equation*}

Combining the equations 1,2,3 with the state propagation equations for the orientation and gyro bias errors developed in the previous posts, the entire 15\times15 state transition matrix is given as:

(4)   \begin{equation*}\begin{bmatrix}\dot{\delta{P_{I}^{G}}} \\ \dot{\delta{V_{I}^{G}}} \\ \dot{\delta{\boldsymbol{\theta}}} \\ \dot{\delta{\boldsymbol{g_b}}} \\ \dot{\delta{\boldsymbol{a_b}}}\end{bmatrix} = \begin{bmatrix} 0_{3\times3} & I_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} \\ 0_{3\times3} & 0_{3\times3} & - \hat{C}^{'G}_{I}[a]_{\times} & 0_{3\times3} & -\hat{C}^{'G}_{I} \\ 0_{3\times3} & 0_{3\times3} & [\phi]_{\times} & I_{3\times3} & 0_{3\times3} \\ 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} \\ 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3}\end{bmatrix}\begin{bmatrix}\delta{P_{I}^{G}} \\ \delta{V_{I}^{G}}\\ \delta{\boldsymbol{\theta}} \\ \delta{\boldsymbol{g_b}} \\ \delta{\boldsymbol{a_b}}\end{bmatrix}\end{equation*}


(5)   \begin{equation*}J = \begin{bmatrix} 0_{3\times3} & I_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} \\ 0_{3\times3} & 0_{3\times3} & - \hat{C}^{'G}_{I}[a]_{\times} & 0_{3\times3} & -\hat{C}^{'G}_{I} \\ 0_{3\times3} & 0_{3\times3} & [\phi]_{\times} & I_{3\times3} & 0_{3\times3} \\ 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} \\ 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3}\end{bmatrix}\end{equation*}

As before, the state transition matrix F is given by:

(6)   \begin{equation*} F \approx I_{15\times15} + J\delta{t}\end{equation*}

The covariance is propagated as:

(7)   \begin{equation*} P_{k\mid k-1}=F_{k}P_{k-1\mid k-1}{F} _{k}^{T}+ {Q} _{k}\end{equation*}

Here k is the time step index.

Let’s now look at the measurement update equations. The simplified geometry given that the IMU and the camera are coincident is shown below:

Note that in our coordinate system, the x axis is oriented along the optical axis of the camera and the z axis points down. This follows the convention used on the MPU6050 IMU. In computer vision, it is more common for the z axis to be pointed along the optical axis.

Let the 3D position of the feature point f_i be represented by x_i, y_i, z_i in the global coordinate system. Denoting the camera calibration matrix by K, the projection of this point, x^{'}_i, y^{'}_i, z^{'}_iis given by:

\begin{bmatrix}x^'_i \\ y^'_i \\ z^'_i\end{bmatrix} = KC_I^G(P_{f_i}^G - P_I^G)

Here K is a 3\times3 matrix:

K = \begin{bmatrix}f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix}

Here f denotes the focal length and u_0 and v_0 the image center. We assume a rectilinear camera with square pixels with no radial and tangential distortion. Those terms can be added easily if needed.

The final image coordinates are obtained by applying the perspective divide.

\begin{bmatrix}u_i \\ v_i\end{bmatrix} = \begin{bmatrix}\frac{y^'_i}{x^'_i} \\ \frac{z^'_i}{x^'_i}\end{bmatrix}

The jacobian matrix consists of the partial derivatives of the image coordinates against the parameters we are trying to estimate, i.e., C_I^G and P_I^G. To obtain the expressions for the partial derivatives, it is useful to think of the perspective projection and coordinate transformation as a function composition and then apply the chain rule. From the chain rule,
D(f(g(x)) = f^{'}(g(x))g^{'}(x)

Here g(\boldsymbol{x}) is the coordinate transformation function that maps the vector \boldsymbol{x} from the global coordinate system to vector \boldsymbol{x^'} in the camera-IMU coordinate system. f(\boldsymbol{x^'}) then applies the perspective divide on the vector \boldsymbol{x^'}.

From straightforward differentiation,

f^{'}(g(\boldsymbol{x})) = f^{'}(\boldsymbol{x}^{'}) = \begin{bmatrix} -\frac{y^{'}_i}{x^{'2}_i} & \frac{1}{x^{'}_i} & 0 \\ -\frac{z^{'}_i}{x^{'2}_i} & 0 & \frac{1}{x^{'}_i}\end{bmatrix}

Let’s denote the matrix represented by f^{'}(\boldsymbol{x}^{'}) by M.

Now let’s consider g^{'}(\boldsymbol{x})

g^{'}(\boldsymbol{x}) = \begin{bmatrix}\frac{\partial{\boldsymbol{x}}}{\partial{C_I^G}}} & \frac{\partial{\boldsymbol{x}}}{\partial{P_I^G}}}\end{bmatrix}

From the vector calculus results derived in part 1,

g^{'}(\boldsymbol{x}) = \begin{bmatrix}KC_G^I[P_{f_i}^G - P_I^G]_{\times}\boldsymbol{\delta\theta} & -KC_G^I\delta{P_{I}^{G}}\end{bmatrix}

The derivatives against the other elements of the state vector are 0. Therefore (after replacing C_G^I and P_G^I by their estimated versions), the measurement update matrix H_{image} is given as:

(8)   \begin{equation*} H_{image} = M\begin{bmatrix}-K\hat{C}_G^I & 0_{3\times3} & K\hat{C}_G^I[P_{f_i}^G - \hat{P}_I^G]_{\times} & 0_{3\times3} & 0_{3\times3} \end{bmatrix}\end{equation*}

Simulation Setup

Our simulation set up consists of a camera and IMU located coincident to each other. In the initial position, the optical axis of the camera is pointed along positive x. The camera is looking at four feature points located a few units away symmetrically around the optical axis. The position of these feature points in the world (global) coordinate system is assumed to be known. During the simulation, the camera + IMU translates and rotates along a predefined trajectory. From this trajectory, simulated sensor measurements are generated by calculating the incremental angular velocities and accelerations and adding bias and noise according to our sensor model. Simulated image measurements are also generated by applying coordinate transformation and perspective projection to the feature points and adding gaussian pixel noise.

State estimation is performed by integrating noisy angular velocities and accelerations (after adjusting for estimated gyro and accel bias) to obtain estimated orientation, position and velocity. The state covariance matrix is propagated according to eq (7). The noise image measurements are then used as a measurement source to correct for the errors in the state vector.

It is well known that integrating noisy accelerometer data to estimate velocity and double integrating to obtain position leads to quick accumulation of errors. The error in the position and orientation will cause the estimated image measurements to drift away from the true measurements (the difference between the two acts as a source of measurement). The video below shows this drift. The camera in the video was held stationary and thus the image projection of the feature points should not move (except for the small perturbation caused by pixel noise).

Let’s also take a look at our simulated trajectory. We use a cube as prop to show the movement of the camera +IMU along the simulated trajectory.

Now let’s look at some of the Matlab code that implements one iteration of the Kalman filter operation.

Let’s first see what happens if no image updates are applied.

We can see that the estimated position very quickly diverges from the true position. Let’s now look at the code for the image measurement update.

The code follows the equation for the image measurement update developed earlier. One difference between image measurement update and accelerometer update discussed in the last post is that we run the image measurement update in a loop until the cost function defined in equation (24) of 1 is lower than a threshold or the maximum number of iterations is reached.

Let’s look at the trajectory with the image updates applied at 1/10 the IMU sampling frequency with two values of image noise sigma

image noise sigma = 0.1
image noise sigma = 0.5

As we can see, adding the image measurement update prevents any systematic error in position from developing and keeps the estimated position wandering around the true position. The amount of wander is higher for higher value of image noise sigma, as expected. The wander around the starting point is likely due to the kalman filter in the process of estimating the gyro and accel biases and converging to a steady state. Once the biases have been estimated and the covariance has converged, the estimated position tracks the true position closely.

The system also correctly estimates the accel and gyro biases. The true and estimated biases for the two different values of image noise sigma are shown below.

Gyro (x) Gyro (y) Gyro (z) Accel (x) Accel (y) Accel (z)
True 0.0127 -0.0177  -0.0067 -0.06 0 0
Estimated (\sigma = 0.1) 0.0126 -0.0176 -0.0067 -0.0581  0.0053 0.0010
Estimated (\sigma = 0.5) 0.0127 -0.0175 -0.0071 -0.0736 0.0031 -0.0092

The residual plots for the position (along y axis) and orientation (roll) are shown below. The residuals mostly lie between the corresponding 3 sigma bounds and thus the filter appears to be performing well.

As suggested in 2, it is better to maintain the covariance matrix in UD factorized form for better numerical stability and efficient calculation of matrix inverse. We have not implemented this optimization.

The full source code is available here.

Implementing a Real System

Our simulation results look quite promising, and thus it is time to implement a real, working system. A real system will present many challenges –

  • Picking suitable feature points in the environment and computing their 3D positions is not easy. Some variant of epipolar geometry based scene geometry computation plus bundle adjustment will have to be implemented.
  • Some way of obtaining the correspondence between 3D feature points and their image projections will have to be devised.
  • Image processing is typically a much slower operation than processing IMU updates. It is quite likely that in a real system, updates from the image processing module are stale by the time they arrive and thus can’t be applied directly to update the current covariance matrix. This time delay will have to be factored in the implementation of the covariance update.
Mirzaei FM, Roumeliotis SI. A Kalman Filter-Based Algorithm for IMU-Camera Calibration: Observability Analysis and Performance Evaluation. IEEE Trans Robot. 2008;24(5):1143-1156. doi:10.1109/tro.2008.2004486
Maybeck P. Stochastic Models: Estimation and Control: Volume 2. Academic Press; 1982.

Be the first to comment

Leave a Reply

Your email address will not be published.