Oscillating Camera SLAM

Mohamed Heshmat Hassan Abdelwahab
Nov 30, 2021

Visual SLAM, or VSLAM, is Simultaneous Localization And Mapping using visual information. SLAM algorithms have gained much popularity due to their importance for mobile robot applications in unstructured environments, and SLAM is considered a cornerstone of any autonomous mobile robot. Visual SLAM is attractive because it uses the cameras already available onboard to complete the SLAM task.

In visual SLAM, there is a correlation between the locations of features in the image stream and the pose of the camera/robot. This correlation drives VSLAM computations. If the features give weak clues, VSLAM performance suffers. The problems of visual SLAM can be summarized as a lack of sufficient physical clues and the lack of an outlier and/or dynamic point rejection mechanism. Therefore, there is a need to actively increase the physical information, or clues, available to the system.

For better VSLAM performance, the robot needs to know, quickly and efficiently, the distances to the landmarks around its location; as a result, the robot can infer where it is. Landmark distance estimation is not a trivial problem, and it is even more difficult with a monocular camera. Monocular cameras, which work as stereo over time, need strong physical clues to recover landmark distances, especially during forward and curved robot motion. The convergence rate of depth estimation is therefore very important in visual SLAM systems, particularly in the monocular case. Feature depth can be recovered from the camera motion, but its computation depends on the type of motion: it is well known that lateral motion is the best motion for computing feature depth, whereas forward and sharp turning motions are difficult cases for depth computation [1][2].
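To see why lateral motion is so much more informative, consider a simple pinhole projection of one landmark observed from two camera positions. The minimal sketch below (the focal length, landmark position, and step sizes are illustrative assumptions, not values from the paper) shows that a small sideways step produces a large, depth-dependent disparity, while the same step forward barely moves the projection of a point near the image centre.

```python
import numpy as np

f = 500.0                        # assumed focal length in pixels
X = np.array([0.5, 0.0, 5.0])    # landmark 0.5 m off-axis, 5 m ahead

def project(X, cam_t):
    """Pinhole projection of a 3D point seen from a camera translated by cam_t."""
    Xc = X - cam_t
    return f * Xc[0] / Xc[2]     # horizontal pixel coordinate

u0 = project(X, np.zeros(3))

# Lateral step of 0.1 m: disparity is f * b / Z, directly invertible for depth.
u_lat = project(X, np.array([0.1, 0.0, 0.0]))
disp_lat = u0 - u_lat
print("lateral disparity: %.2f px -> depth: %.2f m" % (disp_lat, f * 0.1 / disp_lat))

# Forward step of 0.1 m: the projection of a near-centre point barely moves,
# so the same baseline constrains depth far more weakly.
u_fwd = project(X, np.array([0.0, 0.0, 0.1]))
print("forward disparity: %.2f px" % (u_fwd - u0))
```

Running this gives a lateral disparity of 10 px (recovering the true 5 m depth) against roughly 1 px for the forward step of the same length, which is the observability gap the oscillating camera is designed to close.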

To overcome this problem, a new camera acquisition system that help in solving problem of features depth estimation convergence in the forward and curved robot motions. This acquisition system consists of a monocular camera with superimposed lateral oscillations, to improve the physical clues for feature's depth estimate. In contrast to the current direction of image stabilization or anti-shaking camera systems, camera oscillations is argued to enhance VSLAM and improve the features convergence and so the robot localization.
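A minimal sketch of the idea, assuming a sinusoidal lateral offset added on top of the robot's forward motion (the amplitude and frequency below are illustrative assumptions, not the parameters used in the paper):

```python
import numpy as np

def camera_position(t, v=0.5, amp=0.05, freq=2.0):
    """Camera position at time t: the robot drives forward at v m/s while the
    camera oscillates laterally with amplitude amp (m) and frequency freq (Hz).
    All parameter values here are illustrative assumptions."""
    x = v * t                                 # forward robot motion
    y = amp * np.sin(2.0 * np.pi * freq * t)  # superimposed lateral oscillation
    return np.array([x, y])

# The oscillation gives a pure forward trajectory a continuously varying
# lateral baseline, which is exactly the motion that aids depth triangulation.
for t in np.linspace(0.0, 1.0, 5):
    print("t=%.2f s -> camera at %s" % (t, camera_position(t)))
```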

The camera Oscillation idea stems from the well-known biology fact that human and many animals have eyes oscillations, as examples of these animals the mantis and the pigeon [3].

The simulations are done using the publicly available EKF-SLAM toolbox [4]. A robot performing different types of motion in a 16 m x 16 m area is simulated. The environment contains 162 landmarks arranged in two levels: half of the landmarks are on the ground and the rest are 1 m above it. The same feature configuration is used in all experiments to cancel out the effect of scene feature composition [5].
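As a rough illustration of such a map (the regular 9 x 9 grid spacing is an assumption; the toolbox scene may place the landmarks differently):

```python
import numpy as np

# 162 landmarks in a 16 m x 16 m area, half on the ground (z = 0 m) and
# half 1 m above it (z = 1 m). The regular 9 x 9 grid is an assumed layout.
xs, ys = np.meshgrid(np.linspace(-8.0, 8.0, 9), np.linspace(-8.0, 8.0, 9))
xy = np.column_stack([xs.ravel(), ys.ravel()])   # 81 planar positions
landmarks = np.vstack([
    np.column_stack([xy, np.zeros(81)]),         # ground level
    np.column_stack([xy, np.ones(81)]),          # 1 m above ground
])
print(landmarks.shape)                           # (162, 3)
```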

Two experiments were conducted with the robot moving forward for 8 m while exploring scene features. In the first experiment, a normal static camera was used; in the second, an oscillating camera. The figures below show the map of landmarks and the localization errors and uncertainties for the static and oscillating cameras. Localization error here refers to the error in position and orientation.

On the left, feature convergence using the static camera. On the right, feature convergence using the oscillating camera.

The table below shows the localization RMSE for each type of motion. The results show that camera oscillation clearly decreases the position error: with the oscillating camera, the position error drops by more than 50%. For more details about this work, please see the paper [6].

The localization errors for forward and curved robot motions using static and oscillating cameras
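For reference, the position RMSE reported above can be computed along the trajectory as in this minimal sketch (the exact error definitions used in the paper may differ):

```python
import numpy as np

def position_rmse(estimated, ground_truth):
    """Root-mean-square position error over a trajectory, where `estimated`
    and `ground_truth` are (N, 2) arrays of x, y robot positions."""
    errors = np.linalg.norm(estimated - ground_truth, axis=1)
    return np.sqrt(np.mean(errors ** 2))
```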

References:

1- H. Strasdat, J. M. Montiel, and A. J. Davison, “Visual SLAM: why filter?”, Image and Vision Computing, vol. 30, no. 2, pp. 65-77, 2012.

2- S. Huang and G. Dissanayake, “Convergence and consistency analysis for extended Kalman filter based SLAM”, IEEE Transactions on Robotics, vol. 23, no. 5, pp. 1036-1049, 2007.

3- K. Kral, “Behavioural-analytical studies of the role of head movements in depth perception in insects, birds and mammals”, Behavioural Processes, vol. 64, no. 1, pp. 1-12, 2003.

4- J. Sola, D. Marquez, J.-M. Codol, and T. Vidal-Calleja, “EKF-SLAM toolbox for Matlab”, http://www.joansola.eu/JoanSola/eng/toolbox.html.

5- M. Heshmat and M. Abdellatif, “The effect of feature composition on the localization accuracy of visual SLAM systems”, International Conference on Computer Vision Theory and Applications, VISAPP, pp. 419-424, 2012.

6- M. Heshmat, M. Abdellatif, and H. Abbas, “Improving visual SLAM accuracy through deliberate camera oscillations”, IEEE International Symposium on Robotic and Sensors Environments, ROSE, pp. 154-159, 2013.
