This paper has been accepted for publication in IEEE Conference on Robotics and Automation 2024.
©2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Bharat Joshi and Ioannis RekleitisThe authors are with the Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA. bjoshi@email.sc.edu, yiannisr@cse.sc.edu.This research has been supported in part by the National Science Foundation under grants 1943205 and 2024741. The authors also acknowledge the help of the Woodville Karst Plain Project (WKPP), El Centro Investigador del Sistema Acuífero de Quintana Roo A.C. (CINDAQ), Ricardo Constantino, Project Baseline, and Evan Kornacki in collecting data, providing access to challenging underwater caves, and mentoring us in underwater cave exploration. The authors are also grateful for equipment support by Halcyon Dive Systems, Teledyne FLIR LLC, and KELDAN GmbH lights.
Abstract
This paper presents an extension to visual inertial odometry (VIO) by introducing tightly-coupled fusion of magnetometer measurements. A sliding window of keyframes is optimized by minimizing re-projection errors, relative inertial errors, and relative magnetometer orientation errors. The results of IMU orientation propagation are used to efficiently transform magnetometer measurements between frames producing relative orientation constraints between consecutive frames. The soft and hard iron effects are calibrated using an ellipsoid fitting algorithm. The introduction of magnetometer data results in significant reductions in the orientation error and also in recovery of the true yaw orientation with respect to the magnetic north. The proposed framework operates in all environments with slow-varying magnetic fields, mainly outdoors and underwater. We have focused our work on the underwater domain, especially in underwater caves, as the narrow passage and turbulent flow make it difficult to perform loop closures and reset the localization drift. The underwater caves present challenges to VIO due to the absence of ambient light and the confined nature of the environment, while also being a crucial source of fresh water and providing valuable historical records. Experimental results from underwater caves demonstrate the improvements in accuracy and robustness introduced by the proposed VIO extension.
I Introduction
Magnetic measurements are an often neglected source of information mainly because of their sensitivity to ambient noise; however, there are several situations in which they can provide useful information with minimal cost and low computational overhead. In this work we are targeting the underwater domain with the use of a sensor suite[RahmanOceans2018]. The used sensor suite, comprised of a stereo camera, a 9-axis IMU, a water depth sensor, and a pencil beam mechanically scanning sonar, can be deployed by a human to collect data inside an underwater cave; see Fig. 1. We propose a tightly coupled optimization-based fusion of visual, inertial, and magnetometer information. Magnetometer measurements are added as new factors to the keyframe-based sliding window optimization graph proposed in [okvis]. We leverage the IMU preintegration algorithm from[qin2017vins_mono, forster2016manifold] to efficiently compute magnetometer residuals for all measurements. Since the IMU preintegration terms are already defined in the relative inertial error, the additional computational cost is relatively small. The magnetometer data are sensitive to the local magnetic field and require an explicit calibration procedure. As such, the magnetometer measurements are calibrated to account for soft and hard iron effects using an ellipsoid fitting algorithm. Experiments have shown that after calibration, the local magnetic noise is relatively low throughout the trajectory. Furthermore, magnetic data present an absolute measurement based on the magnetic field of Earth and while noisy they are consistent over long trajectories. Our target application is the mapping of underwater caves.
Mapping underwater cave systems is extremely important for environmental protection, fresh water management[Florida2010], and resource utilization[ZexuanReports2016]. Moreover, caves provide valuable historical evidence as they present an undisturbed time capsule[Abbott2014, macdonald2020paleoindian, rissolo2015novel, de2015ancient], and information about geological processes[KresicMikszewski2013]. Diver centered mapping is a dangerous, labor\hypintensive, tedious, and slow task. In addition to limited visibility, color absorption, hazing, and lightning variations, there is no ambient light and the environment is often cluttered and fragile, making navigation difficult, with inadvertent motion maneuvers often reducing visibility to zero. Furthermore, the confined environment prevents resurfacing in case of a problem or emergency. During operations inside underwater caves, there are two constraining factors. First, the passages are often narrow, and the view on the way in is completely different from the view on the way out, making maneuvering difficult. Second, there is significant water flow that makes staying in one spot to collect data in 360 degrees very challenging. These conditions contribute to reducing the number of loop closures that are feasible.
Moreover, underwater cave systems provide an exciting opportunity to use magnetometer measurements as they are devoid of significant magnetic disturbances due to ferromagnetic materials. The introduction of magnetic field measurements in the state estimation process, proposed in this paper, leads to two significant contributions. First of all, absolute orientation measurements are introduced to orient the produced trajectory with respect to the magnetic north. This makes the produced trajectories compatible with the existing man\hypmade maps of the caves. Second, the introduction of magnetometer data constrains the produced trajectories along the yaw direction, producing much more consistent results, eliminating the orientation drift that plagued earlier deployments[JoshiAUV2022].
To the best of our knowledge, our work is the first to use IMU preintegration to introduce high frequency magnetometer measurements in the optimization framework. The proposed framework has been tested in a variety of underwater caves in Florida and Mexico. Qualitative results demonstrate the alignment of the produced trajectories to the existing maps of the caves, and also a significant reduction in orientation error (drift). The produced trajectories resulted in significantly reduced error, compared to the baseline trajectories obtained using COLMAP[colmap_sfm] (a bundle adjustment, global optimization framework).
II Related Work
Vision\hypbased state estimation has proven to be extremely challenging due to the varying lightning conditions, scattering and haziness from floating particulates, and color variations due to light absorption in the water. Several visual and visual-inertial state estimation packages have exhibited severe failures[QuattriniLiISERVO2016, JoshiIROS2019]. Rahman et al.[RahmanICRA2018] proposed an extension of OKVIS[okvis] incorporating a pencil beam mechanically scanning sonar, a water depth sensor, and loop closure capabilities[RahmanIROS2019a, RahmanIJRR2022]. The framework was then adapted for inexpensive action cameras, GoPro 9,[JoshiICRA2022] producing superior performance. More recent packages such as ORB-SLAM3[orbslam3], OpenVINS[geneva2020openvins], etc. have demonstrated better performance, but there is no in-depth analysis of their accuracy and robustness.
Magnetometers have traditionally been fused with acceleration and angular velocity measurements to estimate the vehicle’s attitude; such systems are termed attitude and heading reference systems (AHRS) [madwick_filter, complimentary_filter, zhao_imu_mag_ultrasonic]. Some recent work on the fusion of magnetic field measurements with a visual sensor has gained traction. Wang et al.[wang_vimo] performed visual, inertial, and magnetometer fusion first by initializing the VIO in the reference frame of Earth and then computing the error between magnetometer measurement and Earth’s magnetic field. The authors included only magnetometer measurements near the keyframe, skipping most of the high-frequency magnetometer measurements. Siebler et al. proposed a particle filter method to use magnetic field distortions for magnetometer calibration and localization [Siebler]; without focusing on continuous state estimation. The closest approach to ours used inter-frame preintegrated magnetometer measurements[caruso_mimu, caruso_mimu_kf]. However, it requires an array of magnetometers measuring magnetic field and it’s gradient. Our work is focused on a single 9-axis IMU without requiring additional hardware enhancements.
The calibration of soft and hard iron offsets for magnetometer is very important as they can introduce huge errors in the estimation process. In [mag_extra_calib] an iterative batch least squares estimation was introduced to calibrate a full three-axis magnetometer. Vasconcelos et al. used maximum likelihood estimation to find the optimal calibration parameters that best fit the reading of the onboard sensors to calibrate the 3-axis strapless magnetometer [mag_calib]. Kok et al.[magnetometer_inertial_calibration] proposed maximal likelihood calibration using orientation estimation from inertial sensors. Taking things to the next step, Solin et al. created a map of the magnetic field in an indoor environment by imposing a Gaussian process (GP) prior to the latent scalar potential of the magnetic field[solin_gaussian].
III Visual-Inertial-Magnetometer Fusion
Visual-Inertial Odometry aims to estimate the pose of the moving camera-imu system by fusing inertial information with images. In this work, we extend visual-inertial odometry to fuse magnetometer measurements in a tightly coupled fashion using IMU preintegration.
III-A Problem Formulation
We consider the visual inertial magnetometer odometry problem where we want to track the state of a sensing system (a handheld underwater sensor suite) equipped with an IMU, a stereo/monocular camera, and a magnetometer. The IMU frame coincides with the body frame ”B” we want to track, and the transformation between the camera and the IMU () is known by calibrating the extrinsic parameters and remains constant throughout the experiment; see Fig. 2 . In this work, we use a 9-axis IMU that includes an accelerometer, a gyroscope, and a magnetometer. As such, we assume that the magnetometer and IMU (gyroscope and accelerometer) axes are aligned and that there is no significant axis misalignment. When using stereo cameras, we assume that they are rigidly attached with known extrinsic parameters between the cameras and between the cameras and the IMU.
Nonlinear keyframe-based sliding window optimization is performed to estimate body poses and 3D landmark positions by minimizing the reprojection error of landmarks seen in the camera frame. Thus, we need to estimate the variables where represents 3D landmark positions visible in the sliding window and holds the system states at camera times 1 to with being the total number of keyframes in the sliding window. The system state at time holds position and orientation in the world frame, velocity in the inertial frame , as well as biases of the gyroscope and accelerometer . The system state can be written as
(1) |
where is the quaternion representation of the orientation .
III-B Fusing visual, inertial and magnetometer measurements
The keyframe-based visual-inertial SLAM is formulated as joint nonlinear optimization that maximizes the posterior probability of the system state . Using the problem formulation proposed in [okvis, RahmanIJRR2022], the following cost function is minimized
(2) |
where denotes the camera frame index and denotes the landmark index. The cost function contains the reprojection , inertial , and marginalization residuals weighted by their respective information matrices.
The reprojection error for the landmark is calculated as where is the set of all the landmarks visible in the keyframe . Here, denotes the camera projection model and the measurements in the image coordinates. For more details, please refer to [okvis]. The inertial residuals are obtained using the IMU preintegration theory proposed in [qin2017vins_mono, quatenrion_sola] which is detailed in Section III-C. We employ a marginalization strategy similar to [okvis] to obtain the marginalization prior error term, . Whenever a new frame is inserted in the optimization window, the marginalization operation is classified into two cases. If the oldest frame in the sliding window is not a keyframe, it is marginalized together with the oldest speed and bias states, and all its landmark measurements are dropped to maintain sparsity. In the case where the oldest frame is a keyframe, only the landmarks that are visible in that frame but not in the most recent keyframe are marginalized out.
We introduce the magnetometer residuals based on IMU preintegration into Eq. 2 to obtain the cost function used in this work as
(3) |
where denotes all magnetometer measurements attached to the system state by the error term. In the remainder of Section III, we use the output of the IMU preintegration algorithm to derive the magnetometer error term and the residual weights .
III-C IMU Preintegration
In this section, we review the IMU preintegration formulation, which in turn will be used to derive the magnetometer residuals in Section III-D. The IMU preintegration formulation is based on [qin2017vins_mono] inspired by continuous-time quaternion kinematics from [quatenrion_sola] and uses IMU bias manipulation according to [forster2016manifold].
The accelerometer and gyroscope measurements in body frame B at time are affected by the additive white noise and slowing varying bias
(4) | ||||
The accelerometer and gyroscope noise is modeled as additive Gaussian noise with and , where being the identity matrix. IMU biases are modeled as a slowly varying random walk with and where and . Most modern 9-axis IMUs provide high-frequency synchronized accelerometer, gyroscope, and magnetometer measurements, as shown in Fig.3. As such, and can be propagated in the time interval using accelerometer and gyroscope measurements. Although in practice IMU measurements may not perfectly synchronize with image timestamps, the sensor setup is calibrated using Kalibr[kalibr] beforehand. Propagation in the world frame requires knowledge of the initial state of the system at time . Whenever the system state changes during optimization, repropagation is needed. Thus, the IMU preintegration is performed in the body frame to avoid repropagation whenever system state is updated during optimization for computational efficiency.
The propagation is performed in body frame instead of the world frame as
(5) | ||||
where , and are the preintegration terms which only depend on the inertial measurements and biases. These preintegration terms along with their covariance can be updated iteratively using discrete accelerometer and gyroscope measurements; see [qin2017vins_mono, forster2016manifold] for details. The preintegrated terms , and are updated using the first-order approximation with respect to their biases if the bias estimates are relatively unchanged as proposed in [forster2016manifold]. Otherwise, we redo the propagation using the updated bias estimates. The inertial error is obtained from Eq. 5 as
(6) |
The preintegration term allows us to estimate the body orientation at time during recursive updates which we will use to obtain the magnetometer error in the next section.
III-D Magnetometer Residuals
Magnetometer and IMU measurements obtained from 9-axis IMU used in attitude and heading reference systems (AHRS) are synchronized as shown in Fig.3. The uncertainty of magnetometer measurements is modeled as Gaussian noise such that the magnetometer measurement is
(7) |
where is the earth’s magnetic field at the location in world coordinate frame which is aligned with earth’s east, north and up (ENU) direction. The magnetometer additive white noise can be expressed as . We estimate the magnetometer white noise similar to IMU [ieee_imu_standard] using the allan-variance plot.
However, the magnetometer measures the superposition of Earth’s magnetic field and local magnetic field because of the presence of magnetic materials in the sensor’s vicinity. Thus, the magnetometer is calibrated before experiments to estimate the soft and hard iron effects. The calibration procedure is explained in the next section. During VIO initialization, we align the orientation of the body frame in the ENU direction. When properly calibrated, the Earth’s magnetic field points in the north-down direction. First, the upward-pointing gravity direction is obtained from the accelerometer measurements. The cross product of the magnetic field and the gravity direction provides the east direction. The remaining axis can be obtained from the cross product of the estimated axes. This is a common procedure used in AHRS systems such as the Madgwick filter[madwick_filter] and the complementary filter[complimentary_filter]. Formulating the magnetometer residual directly in terms of Eq. 4 has one significant drawback that any error during the initial alignment is also included in the estimation process. Thus, we formulate the magnetometer residuals as relative orientation constraint between two consecutive frames.
For the magnetometer residual formulation, we assume that the magnetometer samples are calibrated following the procedure in Section III-E. Using the preintegrated orientation term allows us to use all magnetometer measurements between consecutive frames. Given the current consecutive frame state estimates and , we define the magnetometer residual for measurement at time as
(8) |
where is the propagated orientation between the time interval and transforms the magnetometer measurement from time to . This propagated orientation is the rotation matrix representation of the intermediate preintegration result . Rotation represents relative orientation during the time interval in the body frame and transforms the magnetometer measurement from time to . By expressing the error term in the body frame, we can recursively calculate efficiently as . Thus, the magnetometer error using preintegrated orientation allows us to use all the magnetometer measurements in the time interval to optimize the relative orientation between consecutive frames. The magnetometer residual does not depend on the position estimate; as such, it does not directly affect the state’s position estimate. However, a better orientation estimate will certainly yield a better position accuracy in the long run.
The magnetometer measurement estimated at using quaternion preintegration depends on the gyroscope noise. However, the gyroscope noise is already considered while computing and is significantly smaller than the magnetometer noise. Also, we found that the propagated orientation variance was very small compared to the magnetometer variance. Thus, for simplicity and computational efficiency, we assume that the magnetometer residual weights depend only on the magnetometer sensor noise. Moreover, when the state connected to the magnetometer residuals is marginalized, the magnetometer residuals are added as the marginalization prior along with the inertial and visual residuals.
III-E Soft and Hard Iron Effect Calibration
The full magnetometer measurement model taking into account the magnetic disturbances and sensor imperfections can be modeled[mag_calib] as
(9) |
where S represents the soft iron matrix and represents the hard iron effect. Hard iron effects arise because of the permanent magnetization of the material and also depend on the fixed sensor recording mechanism. In particular, we can think of the hard iron effect as a constant bias . Soft iron effects are due to the magnetization of ferromagnetic materials due to the local magnetic field and depend on the orientation with respect to the local magnetic field[magnetometer_inertial_calibration]. The soft iron effect can be represented as a 33 symmetric matrix S. We do not consider other sources of magnetometer errors that could arise from non-orthogonality of the magnetometer axes or differences in sensitivity along the three magnetometer axes[mag_extra_calib].
Without the soft and hard iron offset, if we rotate a magnetometer sensor then the magnetic field vector falls on the surface of sphere with a radius equal to the magnitude of the earth’s magnetic field. Eq. 9 can be seen as a linear transformation that maps points on a sphere to an ellipsoid[mag_ellipsoid]. Thus, the magnetometer calibration problem can be interpreted as ellipsoid fitting of the points to the sphere surface. We solve the ellipsoid fit problem detailed in Kok et al.[magnetometer_inertial_calibration] to obtain the estimate of and the hard iron offset . For correct calibration, the magnetometer needs to be rotated along all three axes several times. Magnetometer measurements are corrected in the body frame as follows:
(10) |
These corrected measurements are used to calculate magnetometer residuals in the sliding-window optimization as explained in Section III-D. In an indoor environment, these hard and soft iron effects continue to change as we move through the different areas. The underwater cave environment is ideally suited for magnetometer fusion as the likelihood of magnetic disturbance is minimal. As such, we performed the calibration at the beginning and assumed that the local magnetic field did not change significantly. During deployment, we performed calibration at the beginning and at the end of the dive, both calibrations resulted in similar values, validating the above assumption. On older datasets, we did not record an explicit magnetometer calibration sequence. In such a case, these are not enough magnetometer measurements in all directions; thus full mapping from ellipse to sphere is degenerate. Thus, we only calibrate the hard iron effect. Figure 4 presents the ellipsoids fitted during complete calibration for the new sensor and for partial calibration of the old sensor.
IV Experimental Results
We present evaluation results from two experiments in the Devil’s Eye cave system, FL, USA and one experiment in the Dos Ojos Cenote, QR, Mexico. In all experiments the diver enters and exits the cave from the same location.
IV-A Datasets
All the datasets are collected utilizing the sensor suite described in Rahman et al.[RahmanOceans2018]. The first dataset was collected using IDS UI-3251LE cameras in a stereo configuration and Microstrain 3DM-GX5-25 AHRS in the Devil’s cave system, Florida. We call this dataset cave1_short and images are captured at 15Hz. The cave1_short dataset spans approximately 220m. The second dataset was collected with the same sensor suite in Dos Ojos Cenote, QR, Mexico. This trajectory is very long (680m) with a duration of more than 50 minutes. COLMAP was unable to finish the complete trajectory due to computational and memory constraints. Thus, this data set is used only for qualitative evaluation.
The third dataset was collected using a similar sensor suite in monocular setup with Flir BFS-U3-16S7C-C camera and microstrain 3DM-GV7 AHRS. We call this dataset cave1_long with total trajectory length of approximately 506m. Both datasets are very challenging in terms of low brightness, illumination changes due to artificial lighting, and low visibility.
IV-B Results
Due to the absence of GPS or motion capture systems underwater, COLMAP[colmap_sfm] is used to produce baseline trajectories. COLMAP is a Structure-from-Motion (SfM) framework that performs joint optimization of camera poses and map structure using global bundle adjustment. Since one of the datasets only contains monocular images, we consider the COLMAP trajectory up to scale for uniformity. Thus, the trajectories are aligned with the COLMAP baseline trajectories using sim3 alignment[umeyama].
We compare the performance of our magnetometer formulation with the VIO formulation from OKVIS[okvis] in terms of absolute trajectory error (ATE) [absolute_traj_error]. We include the results of the VIO-only case without fusing magnetometer measurements and compare them with the proposed method after sim(3) alignment. Each method is run three times and the absolute trajectory error in terms of degree/meters is reported in Table I. As seen in the table, there is a significant reduction in both translation and rotation RMSE. In particular in the cave1_long dataset, both the rotation and translation error decreased significantly from 15°to 6°and from 14m to 3m.
dataset | length | VIO | VIO+MAG |
cave1_short | 220m | 6.31°/6.45m | 4.48°/4.98m |
cave1_long | 506m | 14.65°/14.57m | 5.83°/3.37m |
We also computed the relative trajectory error (RPE) that computes the relative error between states at different times with the focus on the yaw error [Zhang18iros]. RPE shows the local drift over different sections of the trajectory and is unaffected by previously accumulated error. The relative yaw error calculated for cave1_long trajectory after aligning the initial pose is shown in Fig. 6. We can see that when using the magnetometer the yaw error remains constant over large distances, whereas the VIO accumulates error when distances keep increasing.
Two different deployments from the Devil’s Eye cave system are presented in Fig. 5. The first deployment, a short foray of 100 meters penetration inside the cave, can be seen in the left image, and the drift in the trajectory (blue) without magnetometer data fusion is clear. The visual inertial magnetometer fusion (green) follows much more closely the ground truth trajectory from COLMAP. More dramatic improvements are presented in the middle figure with trajectory of more than 500 meters. Compared to the globally optimized (COLMAP) trajectory there was reduction of positional error from 2.8% to 0.6%. The right image presents a partial map of the cave system with the COLMAP trajectory superimposed. The map is not up to scale, thus some passages are not scaled uniformly. Nonetheless, the trajectory follows the main passages of the cave system.
Figure 7 presents the trajectory resulting from the deployment of the old sensor suite at the Cenote Dos Ojos. During the 50-minute deployment, the sensor covered approximately 680 meters inside a highly decorated cave. The large volume of data made the production of a COLMAP\hypbased ground truth trajectory infeasible due to computation and memory constraints. As such, this trajectory is used only as a qualitative comparison. As can be seen, the VIO diverged almost 100 meters. The trajectory obtained from magnetometer fusion follows the path out much more closely to the path taken on the way in. It is worth noting that the sensor’s trajectory does not follow exactly the same path; however, the confined structure of the cave passage constrains the inbound and the outbound trajectories fairly close.
V Conclusion
In this paper we presented a novel approach for augmenting VIO with magnetic field data. The experimental results demonstrated significant improvements in the accuracy of the resulting trajectories, significant reduction in the orientation error, and also the ability to orient the resulting trajectory with the magnetic north. During the estimation process, the magnetometer and the accelerometers provide absolute attitude information (with respect the magnetic north and the gravity vector).
Future work will explore the performance of the proposed system during shipwreck mapping, where the sensor traverses near metal structures. During initial calibration we expect to recover the true yaw of the sensor with respect to the magnetic north, enabling better positioning of the underwater structure in space. Furthermore, deployments of the proposed system on an autonomous underwater vehicle will require placing the magnetometer at some distance from the motors. A future experimental study will evaluate the magnetic noise levels for popular vehicles, such as the BlueRoV2[bluerov] or the Aqua2 AUV[DudekIROS2005].
\printbibliography