We proposed an innovative hybrid visual SLAM method that combines the advantages of the binocular and monocular approaches while overcoming their disadvantages. The monocular approach makes feature points easy to track and is capable of building a dense map. However, it usually suffers from the scale factor problem, which results in an inaccurate robot location and an inaccurate map. The binocular approach, by contrast, yields an accurate robot location and an accurate map without the scale factor problem, but the resulting map lacks detail. We combine the two approaches by adaptively switching between them for each feature point.
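As a rough illustration of per-feature switching, the sketch below assigns each feature point to binocular or monocular handling. It is only a sketch under stated assumptions, not the method's actual criterion: it assumes a rectified pinhole stereo pair, that near points are handled binocularly and far points monocularly, and that the decision is a simple depth threshold. The names `Feature`, `choose_mode`, `focal_px`, `baseline_m`, and `max_stereo_depth_m` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    u_left: float   # horizontal pixel coordinate in the left image
    u_right: float  # horizontal pixel coordinate in the right image


def choose_mode(feature, focal_px, baseline_m, max_stereo_depth_m):
    """Return ("binocular", depth) for near points and ("monocular", None) otherwise.

    Assumption: a rectified pinhole stereo pair, so depth = focal * baseline / disparity.
    The depth threshold is a hypothetical switching rule used only for illustration.
    """
    disparity = feature.u_left - feature.u_right
    if disparity <= 0.0:
        # No usable stereo depth: fall back to monocular handling.
        return ("monocular", None)
    depth = focal_px * baseline_m / disparity
    if depth <= max_stereo_depth_m:
        return ("binocular", depth)
    return ("monocular", None)


features = [Feature(320.0, 300.0), Feature(410.0, 409.5)]
modes = [
    choose_mode(f, focal_px=500.0, baseline_m=0.1, max_stereo_depth_m=20.0)
    for f in features
]
```

With these illustrative numbers, the first feature has a large disparity and is treated as a near, binocular point, while the second has a sub-pixel disparity and is left to the monocular branch.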
We also proposed a method that separates the 6D camera motion into a 3D rotation and a 3D translation and estimates them efficiently. Camera rotation can be estimated by observing only far feature points, and camera translation by observing only near feature points. Our method is suitable for a stereo camera, such as a compound stereo catadioptric camera, which can quickly classify feature points into near and far.
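The decoupling can be sketched as two small least-squares problems. The code below is a minimal illustration under assumptions that are not taken from the method itself: far points are modeled as unit bearing vectors that are unaffected by translation, the rotation is fitted with the standard SVD-based (Kabsch) solution, and near points are assumed to have stereo-triangulated 3D coordinates in both frames. The function names and the convention `p_curr = R p_prev + t` are hypothetical choices.

```python
import numpy as np


def rotation_from_far_points(bearings_prev, bearings_curr):
    """Least-squares rotation R with bearings_curr[i] ~ R @ bearings_prev[i].

    Uses the SVD-based (Kabsch) solution; far points are assumed to be
    insensitive to camera translation.
    """
    a = np.asarray(bearings_prev, dtype=float)
    b = np.asarray(bearings_curr, dtype=float)
    h = a.T @ b  # 3x3 correlation matrix, sum over i of a_i b_i^T
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against a reflection
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T


def translation_from_near_points(points_prev, points_curr, rotation):
    """With R fixed, the least-squares t in p_curr ~ R p_prev + t is the mean residual."""
    p = np.asarray(points_prev, dtype=float)
    q = np.asarray(points_curr, dtype=float)
    return (q - p @ rotation.T).mean(axis=0)


# Synthetic check with a known motion (illustrative values only).
angle = 0.1
r_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([0.2, 0.0, 0.05])

far_prev = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.0, 0.8]])
far_curr = far_prev @ r_true.T
near_prev = np.array([[0.5, 0.2, 1.0], [-0.3, 0.1, 1.5], [0.1, -0.4, 0.8]])
near_curr = near_prev @ r_true.T + t_true

r_est = rotation_from_far_points(far_prev, far_curr)
t_est = translation_from_near_points(near_prev, near_curr, r_est)
```

Estimating the rotation first from far points leaves the translation as a linear problem over the near points, which is one plausible reason the separation can be efficient.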