Efficient Monocular SLAM by Using a Structure-Driven Mapping

Jose Martinez Carranza, Efficient Monocular SLAM by Using a Structure-Driven Mapping. PhD thesis. University of Bristol. May 2012. PDF, 27079 Kbytes.


Important progress has been made in recent years on the monocular SLAM problem, which consists of estimating the 6-D pose of a single camera whilst building a 3-D map of the scene structure observed by the camera. There now exist various monocular SLAM systems capable of outputting camera and map estimates at camera frame rate over long trajectories, in both indoor and outdoor scenarios. These systems are attractive due to their low cost - a consequence of using a conventional camera - and have been widely used in applications such as augmented and virtual reality.
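The joint state described above - a 6-D camera pose plus a growing set of 3-D scene points - can be sketched as a minimal data structure. This is an illustrative sketch only, not the thesis's actual state representation: the class and field names are hypothetical, and orientation is shown as Euler angles purely for brevity (real systems typically use quaternions or rotation matrices).

```python
from dataclasses import dataclass, field

@dataclass
class CameraPose:
    # 6-D pose: 3-D position plus 3-D orientation (Euler angles, for illustration)
    position: tuple      # (x, y, z) in the world frame
    orientation: tuple   # (roll, pitch, yaw) in radians

@dataclass
class MonocularSLAMState:
    # Joint estimate maintained by the SLAM system at each frame
    pose: CameraPose
    landmarks: list = field(default_factory=list)  # mapped 3-D points (x, y, z)

# Camera starts at the world origin; one scene point has been mapped so far.
state = MonocularSLAMState(CameraPose((0.0, 0.0, 0.0), (0.0, 0.0, 0.0)))
state.landmarks.append((1.2, 0.4, 3.5))
```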

However, the utility of the built map has largely been limited to serving as an effective reference for robust and fast camera localisation. To produce more useful maps, various works have proposed incorporating higher-level structures such as lines, planes and even meshes. Planes are among the most popular structures to incorporate into the map, since they are abundant in man-made scenes and since a plane by itself provides implicit semantic cues about the scene structure. Nevertheless, planar structure detection is very often carried out by ad-hoc auxiliary methods, resulting in delayed detection and therefore delayed mapping - a problem when rapid planar mapping is required.

My thesis addresses the problem of planar structure detection and mapping by proposing a novel mapping mechanism called structure-driven mapping. This approach enables a monocular SLAM system to perform planar or point mapping according to the scene structure observed by the camera. To achieve this, we incorporate the plane detection task into the SLAM process itself, through a novel framework that unifies planar and point mapping under a common parameterisation. This allows map components to evolve according to the incremental visual observations of the scene structure, providing undelayed planar mapping. Moreover, plane detection stops as soon as the camera explores a non-planar scene, avoiding unnecessary processing, and resumes as soon as planar structure comes back into view.
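The idea of a common parameterisation under which a map component can evolve from a point into a plane might be sketched as follows. This is a hypothetical illustration, not the parameterisation used in the thesis: the feature here is anchored at the camera position of its first observation with a viewing ray and an inverse depth, and an optional plane normal that is filled in only once the observations support a planar interpretation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class MapFeature:
    """Hypothetical unified point/plane feature: every feature starts as a
    point estimate; a plane normal is added only when evidence accumulates."""
    anchor: Tuple[float, float, float]     # camera position at first observation
    direction: Tuple[float, float, float]  # unit ray towards the feature
    inv_depth: float                       # inverse depth along the ray
    normal: Optional[Tuple[float, float, float]] = None  # set => planar

    def is_planar(self) -> bool:
        # A feature is treated as planar once a normal has been estimated.
        return self.normal is not None

    def point(self) -> Tuple[float, float, float]:
        # Recover the 3-D point: anchor + (1 / inv_depth) * direction
        d = 1.0 / self.inv_depth
        return tuple(a + d * r for a, r in zip(self.anchor, self.direction))

# A feature first observed from the origin, looking along +z, two metres away.
f = MapFeature(anchor=(0.0, 0.0, 0.0), direction=(0.0, 0.0, 1.0), inv_depth=0.5)
assert not f.is_planar()          # begins life as an ordinary point feature
f.normal = (0.0, 0.0, -1.0)       # promoted to planar once evidence accumulates
```

The attraction of such a scheme is that point and planar features share the same core state, so the estimator can switch a feature's role without re-initialising it - which is how undelayed planar mapping becomes possible.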

In my thesis I present a thorough evaluation of this approach through simulation experiments and results obtained with real data. I also present a visual odometry application that takes advantage of the efficient way in which scene structure is mapped by the proposed mechanism. The results suggest the feasibility of performing simultaneous planar structure detection, localisation and mapping within a single coherent estimation framework.
