Efficient Monocular SLAM Using a Structure-Driven Mapping

Important progress has been achieved in recent years with regard to the monocular SLAM problem, which consists of estimating the 6-D pose of a single camera whilst building a 3-D map representation of the scene structure observed by the camera. Nowadays, there exist various monocular SLAM systems capable of outputting camera and map estimates at camera frame rates over long trajectories, in both indoor and outdoor scenarios. These systems are attractive due to their low cost - a consequence of using a conventional camera - and have been widely utilised in applications such as augmented and virtual reality.

However, the main utility of the built map has largely been limited to serving as an effective reference system for robust and fast camera localisation. In order to produce more useful maps, different works have proposed the use of higher-level structures such as lines, planes and even meshes. Planes are among the most popular structures to incorporate into the map, given that they are abundant in man-made scenes and that a plane by itself provides implicit semantic cues about the scene structure. Nevertheless, planar structure detection is very often carried out by ad-hoc auxiliary methods that deliver a delayed detection and therefore a delayed mapping, which becomes a problem when rapid planar mapping is demanded.

My thesis work addresses the problem of planar structure detection and mapping by proposing a novel mapping mechanism called structure-driven mapping. This new approach aims at enabling a monocular SLAM system to perform planar or point mapping according to the scene structure observed by the camera. In order to achieve this, we propose to incorporate the plane detection task into the SLAM process. For this purpose, we have developed a novel framework that unifies planar and point mapping under a common parameterisation. This enables map components to evolve according to the incremental visual observations of the scene structure, thus providing undelayed planar mapping. Moreover, the plane detection task stops as soon as the camera explores a scene without planar structure, which avoids wasting processing time, and starts again as soon as planar structure comes into view.

In my thesis I present a thorough evaluation of this novel approach through simulation experiments and results obtained with real data. I also present a visual odometry application which takes advantage of the efficient way in which the scene structure is mapped by the novel mapping mechanism presented in this work. Taken together, the results suggest the feasibility of performing simultaneous planar structure detection, localisation and mapping within the same coherent estimation framework.

THESIS: PDF



Efficient Visual Odometry Using a Structure-Driven Temporal Map

International Conference on Robotics and Automation. St Paul, Minnesota, USA. May, 2012

We describe a method for visual odometry using a single camera based on an EKF framework. Previous work has shown that filtering-based approaches can achieve accuracy comparable to that of optimisation methods, provided that large numbers of features are used. However, computational requirements are significantly increased and frame rates are low. We address this by employing higher-level structure - in the form of planes - to efficiently parameterise features and so reduce the filter state size and computational load. Moreover, we extend a 1-point RANSAC outlier rejection method to the case of features lying on planes. Results of experiments with both simulated and real-world data demonstrate that the method is effective, achieving comparable accuracy whilst running at significantly higher frame rates.
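To give a rough feel for why a plane-based parameterisation shrinks the filter, the sketch below simply counts state entries. All of the numbers are assumptions made for illustration and are not taken from the paper: a 13-parameter camera state, 3 parameters per independently mapped point, 3 parameters per plane, and a caller-supplied number of extra parameters for each point that lies on a plane. The function names are hypothetical.

```python
# Illustrative state-size arithmetic for an EKF map.
# Every default below is an assumption for this sketch, not a figure from the paper.

def state_dim_points(n_points, camera_dim=13, params_per_point=3):
    """State size when every feature is mapped as an independent 3-D point."""
    return camera_dim + params_per_point * n_points


def state_dim_planar(n_points, n_planes, extra_params_per_point,
                     camera_dim=13, params_per_plane=3):
    """State size when features are attached to planes.

    extra_params_per_point is whatever each planar feature still adds to
    the state on top of its plane (0 if the plane alone fixes its position).
    """
    return (camera_dim
            + params_per_plane * n_planes
            + extra_params_per_point * n_points)


def covariance_entries(state_dim):
    """The EKF covariance is a square matrix over the state."""
    return state_dim * state_dim


if __name__ == "__main__":
    n_pts = state_dim_points(100)
    n_pln = state_dim_planar(100, n_planes=4, extra_params_per_point=0)
    print(n_pts, covariance_entries(n_pts))
    print(n_pln, covariance_entries(n_pln))
```

Because the covariance grows with the square of the state size, even a moderate reduction in parameters per feature changes the amount of data the filter has to update on every frame.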



Video PDF



Unifying Planar and Point Mapping in Monocular SLAM

British Machine Vision Conference. Aberystwyth, UK. September, 2010

Mapping planar structure in vision-based SLAM can increase robustness and significantly improve efficiency of map representation. However, previous systems have implemented planar mapping as an auxiliary process on top of point-based mapping, leading to delayed initialisation and increased feature management overhead. We address this by introducing a unified mapping framework based on a common parameterisation in which both planar and point features are mapped directly, as and when appropriate according to scene structure. Specifically, no distinction is made between points and planes at initialisation - the 'best' representation emerges after matching has progressed - hence minimising delay and making the detection of planar structure implicit in the method, avoiding the need for an additional process. We demonstrate the approach within an EKF monocular SLAM system and show its potential for efficient and robust mapping over large areas in both indoor and outdoor environments, including examples of fast relocalisation.
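For contrast, the sketch below shows the kind of separate, after-the-fact planarity check that the unified framework is designed to avoid. It is not the method of the paper: it takes a plane through the first three mapped points and asks whether the remaining points lie within a tolerance of it. The function names and the tolerance value are hypothetical.

```python
import math

# A stand-alone planarity check on already-mapped 3-D points.
# This is a generic geometric test used only for illustration; the paper
# makes plane detection implicit in the filter instead.

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_from_three(p0, p1, p2):
    """Return (unit normal, point on plane) for three non-collinear points."""
    n = _cross(_sub(p1, p0), _sub(p2, p0))
    length = math.sqrt(_dot(n, n))
    if length == 0.0:
        raise ValueError("points are collinear; no unique plane")
    return tuple(c / length for c in n), p0


def looks_planar(points, tol=0.01):
    """True if every point beyond the first three is within tol of their plane."""
    if len(points) < 4:
        raise ValueError("need at least four points to test planarity")
    normal, origin = plane_from_three(*points[:3])
    return all(abs(_dot(_sub(p, origin), normal)) <= tol for p in points[3:])
```

A check like this can only run once enough points have been mapped individually, which is the source of the delayed initialisation mentioned above.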

Video PDF



Efficiently Increasing Map Density in Visual SLAM Using Planar Features with Adaptive Measurements

British Machine Vision Conference. London, UK. September, 2009

Point-based visual SLAM suffers from a trade-off between map density and computational efficiency. With too few mapped points, tracking range is restricted and resistance to occlusion is reduced, whilst expanding the map to give a dense representation significantly increases computation. We address this by introducing higher-order structure into the map using planar features. The parameterisation of structure allows frame-by-frame adaptation of measurements according to visibility criteria, increasing the map density without increasing computational load. This facilitates robust camera tracking over wide changes in viewpoint at significantly reduced computational cost. Results of real-time experiments with a hand-held camera demonstrate the effectiveness of the approach.



Video PDF



Appearance Based Extraction of Planar Structure in Monocular SLAM

To be presented at the Scandinavian Conference on Image Analysis. Oslo, Norway. June, 2009

This work concerns the building of enhanced scene maps during real-time monocular SLAM. Specifically, we present a novel algorithm for detecting and estimating planar structure in a scene based on both appearance and geometric information. We adopt a hypothesis testing framework, in which the validity of planar patches within a triangulation of the point-based scene map is assessed against an appearance metric. A key contribution is that the metric incorporates the uncertainties available within the SLAM filter through the use of a test statistic assessing error distribution against predicted covariances, hence maintaining a coherent probabilistic formulation. Experimental results indicate that the approach is effective, having good detection and discrimination properties, and leading to convincing planar feature representations.
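The general idea of testing an error against a predicted covariance can be sketched with a standard chi-square gate on the squared Mahalanobis distance. This is a generic consistency test, not the specific appearance metric of the paper: it assumes a 2-D residual with a 2x2 predicted covariance and a 95% acceptance level (5.991 for two degrees of freedom). The function names are hypothetical.

```python
# Generic chi-square gating of a 2-D residual against its predicted covariance.
# Assumed for this sketch: 2-D residual, 2x2 covariance, 95% acceptance level.

CHI2_95_2DOF = 5.991  # 95% quantile of the chi-square distribution, 2 d.o.f.


def mahalanobis_sq_2d(residual, cov):
    """Return r^T S^-1 r for a 2-D residual r and 2x2 covariance S."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    rx, ry = residual
    # Uses the closed-form inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]].
    return (d * rx * rx - (b + c) * rx * ry + a * ry * ry) / det


def is_consistent(residual, cov, threshold=CHI2_95_2DOF):
    """Accept the hypothesis if the normalised error falls inside the gate."""
    return mahalanobis_sq_2d(residual, cov) <= threshold
```

The same residual can be accepted or rejected depending on how uncertain the filter currently is, which is what keeps a test of this kind consistent with the filter's probabilistic estimate.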

Video PDF