Efficient Monocular SLAM Using a Structure-Driven Mapping
Important progress has been achieved in recent years with regard to
the monocular SLAM problem, which consists
of estimating the 6-D pose of a single camera, whilst building a 3-D
map representation of scene structure observed by the camera.
Nowadays, there exist various monocular SLAM systems capable of
outputting camera and map estimates at camera frame rates
over long trajectories and for indoor and outdoor scenarios. These
systems are attractive due to their low cost - a consequence of using
a conventional camera - and have been widely utilised in different
applications such as augmented and virtual reality.
However, the main utility of the built map has been limited to serving as
an effective reference system for robust and fast
camera localisation. In order to produce more useful maps, different
works have proposed the use of
higher-level structures such as lines, planes and even meshes. Planar
structure is one of the most popular structures
to be incorporated into the map, given that planes are abundant in
man-made scenes, and because a plane by itself
provides implicit semantic cues about the scene structure.
Nevertheless, planar structure detection is very often
carried out by ad hoc auxiliary methods that deliver
delayed detection
and therefore delayed mapping, which becomes a problem
when rapid planar mapping is required.
My thesis work addresses the problem of planar structure detection and
mapping by proposing a novel mapping mechanism called
structure-driven mapping. This new approach aims at enabling a
monocular SLAM system to perform planar
or point mapping according to scene structure observed by the camera.
In order to achieve this,
we propose to incorporate the plane detection task into the SLAM
process. For this purpose, we
have developed a novel framework that unifies planar and point mapping
under a common parameterisation.
This enables map components to evolve according to the incremental
visual observations of the scene
structure, thus providing undelayed planar mapping. Moreover, the plane
detection task
stops as soon as the camera explores a scene without planar structure,
which avoids wasting
processing time, and starts again as soon as planar
structure comes into view.
In my thesis I present a thorough evaluation of this novel approach through
simulation experiments
and results obtained with real data.
I also present a visual odometry application which takes advantage of
the efficient way in which
the scene structure is mapped by using the novel mapping mechanism presented
in this work. Overall, the results suggest the feasibility of
performing simultaneous
planar structure detection, localisation and mapping within the same
coherent estimation framework.
THESIS: PDF
Efficient Visual Odometry Using a Structure-Driven Temporal Map
International Conference on Robotics and Automation. St Paul-Minnesota, USA. May, 2012
We describe a method for visual odometry using a
single camera based on an EKF framework. Previous work has
shown that filtering-based approaches can achieve accuracy
comparable to that of optimisation methods provided
that large numbers of features are used. However, computational
requirements are significantly increased and frame rates
are low. We address this by employing higher-level structure - in
the form of planes - to efficiently parameterise features and so
reduce the filter state size and computational load. Moreover, we
extend a 1-point RANSAC outlier rejection method to the case
of features lying on planes. Results of experiments with both
simulated and real-world data demonstrate that the method
is effective, achieving comparable accuracy whilst running at
significantly higher frame rates.
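As a rough illustration of why grouping features on planes can shrink the filter, the sketch below compares EKF state sizes under an assumed encoding. The parameter counts (a 13-parameter camera state, 3 parameters per free point, 3 per plane, 2 in-plane coordinates per constrained point) and all function names are illustrative assumptions, not the parameterisation used in the paper. It also assumes that the covariance update cost grows with the square of the state dimension, which is the usual rule of thumb for an EKF.

```python
# Illustrative parameter counts (assumptions, not taken from the paper).
CAMERA_PARAMS = 13      # position 3 + quaternion 4 + linear and angular velocity 6
POINT_PARAMS = 3        # free 3-D point
PLANE_PARAMS = 3        # minimal plane encoding
ON_PLANE_PARAMS = 2     # in-plane coordinates of a point constrained to a plane


def state_dim_points(num_points):
    """State size when every feature is an independent 3-D point."""
    return CAMERA_PARAMS + POINT_PARAMS * num_points


def state_dim_planar(num_points, num_planes):
    """State size when every feature lies on one of `num_planes` planes."""
    return CAMERA_PARAMS + PLANE_PARAMS * num_planes + ON_PLANE_PARAMS * num_points


def relative_update_cost(dim_a, dim_b):
    """Ratio of covariance-update costs, assuming cost grows as dimension squared."""
    return (dim_a ** 2) / (dim_b ** 2)


if __name__ == "__main__":
    points_only = state_dim_points(100)
    with_planes = state_dim_planar(100, 4)
    print(points_only, with_planes)
    print(relative_update_cost(with_planes, points_only))
```

Under these assumed counts, 100 features spread over 4 planes give a state of 225 parameters instead of 313, so the saving per feature is modest but the quadratic cost makes it add up.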
Video PDF
Unifying Planar and Point Mapping in Monocular SLAM
British Machine Vision Conference. Aberystwyth, UK. September, 2010
Mapping planar structure in vision-based SLAM can increase robustness and significantly improve efficiency of map representation. However, previous systems have implemented planar mapping as an auxiliary process on top of point-based mapping, leading to delayed initialisation and increased feature management overhead. We address this by introducing a unified mapping framework based on a common parameterisation in which both planar and point features are mapped directly, as and when appropriate according to scene structure. Specifically, no distinction is made between points and planes at initialisation - the 'best' representation emerges after matching has progressed - hence minimising delay and making the detection of planar structure implicit in the method, avoiding the need for an additional process. We demonstrate the approach within an EKF monocular SLAM system and show its potential for efficient and robust mapping over large areas in both indoor and outdoor environments, including examples of fast relocalisation.
Video PDF
Efficiently Increasing Map Density in Visual SLAM Using Planar Features with Adaptive Measurements
British Machine Vision Conference. London, UK. September, 2009
Point-based visual SLAM suffers from a trade-off between map density and computational efficiency. With too few mapped points, tracking range is restricted and resistance to occlusion is reduced, whilst expanding the map to give a dense representation significantly increases computation. We address this by introducing higher-order structure into the map using planar features. The parameterisation of structure allows frame-by-frame adaptation of measurements according to visibility criteria, increasing the map density without increasing computational load. This facilitates robust camera tracking over wide changes in viewpoint at significantly reduced computational cost. Results of real-time experiments with a hand-held camera demonstrate the effectiveness of the approach.
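The idea of adapting measurements per frame can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes candidate points on a planar feature are already expressed in camera coordinates, uses a plain pinhole model, and treats "visible" as meaning in front of the camera, on a plane that faces the camera, and inside the image. The function names, the visibility criteria and the fixed measurement budget `max_count` are all hypothetical.

```python
def project(point_cam, focal, cx, cy):
    """Pinhole projection of a point given in camera coordinates (z forward)."""
    x, y, z = point_cam
    if z <= 0:
        return None  # behind the camera
    return (focal * x / z + cx, focal * y / z + cy)


def faces_camera(normal_cam, point_cam):
    """True if the plane's outward normal points back towards the camera centre."""
    return sum(n * p for n, p in zip(normal_cam, point_cam)) < 0


def select_measurements(points_cam, normal_cam, width, height,
                        focal, cx, cy, max_count):
    """Pick up to `max_count` points on the plane that are visible in this frame."""
    selected = []
    for point in points_cam:
        if len(selected) == max_count:
            break  # fixed measurement budget keeps per-frame cost bounded
        pixel = project(point, focal, cx, cy)
        if pixel is None or not faces_camera(normal_cam, point):
            continue
        u, v = pixel
        if 0 <= u < width and 0 <= v < height:
            selected.append(pixel)
    return selected
```

Because the budget is fixed, a plane can carry many candidate points while the number of measurements made per frame, and hence the filter update cost, stays bounded.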
Video PDF
Appearance Based Extraction of Planar Structure in Monocular SLAM
To be presented at the Scandinavian Conference on Image Analysis. Oslo, Norway. June, 2009
This work concerns the building of enhanced scene maps during real-time
monocular SLAM. Specifically, we present a novel algorithm for detecting
and estimating planar structure in a scene based on both appearance and
geometric information. We adopt a hypothesis
testing framework, in which the validity of planar patches within a
triangulation of the point-based scene map is assessed against an
appearance metric. A key contribution is that the metric incorporates the
uncertainties available within the SLAM filter through the use of
a test statistic assessing error distribution against predicted covariances,
hence maintaining a coherent probabilistic formulation. Experimental results indicate that the approach
is effective, having good detection and discrimination properties, and leading to
convincing planar feature representations.
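A generic form of such a test is a chi-square gate on the squared Mahalanobis distance between an observed error and its predicted covariance. The sketch below is a minimal example under stated assumptions, not the paper's metric: it assumes a 2-D error vector, a 2x2 predicted covariance, and a 95% acceptance level; the function names are hypothetical.

```python
# 95% quantile of the chi-square distribution with 2 degrees of freedom.
CHI2_95_2DOF = 5.991


def mahalanobis_sq_2d(error, cov):
    """Squared Mahalanobis distance e^T S^-1 e for a 2-D error and 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    ex, ey = error
    # Inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (ex * (d * ex - b * ey) + ey * (-c * ex + a * ey)) / det


def consistent_with_prediction(error, cov, threshold=CHI2_95_2DOF):
    """Accept the hypothesis if the error is plausible under the predicted covariance."""
    return mahalanobis_sq_2d(error, cov) <= threshold


if __name__ == "__main__":
    tight = ((1.0, 0.0), (0.0, 1.0))
    loose = ((4.0, 0.0), (0.0, 4.0))
    print(consistent_with_prediction((3.0, 0.0), tight))
    print(consistent_with_prediction((3.0, 0.0), loose))
```

The same error can be rejected under a tight predicted covariance and accepted under a looser one, which is the sense in which the filter's uncertainties enter the decision.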