We describe a robust system for vision-based SLAM with a single camera that runs in real time, typically at around 30 fps. The key contribution is a novel use of multi-resolution descriptors in a coherent top-down framework. The resulting system outperforms previous methods in robustness to erratic motion and camera shake, and in its ability to recover from measurement loss. SLAM itself is implemented within an unscented Kalman filter framework based on a constant position motion model, which is also shown to provide further resilience to non-smooth camera motion. Results are presented illustrating successful SLAM operation for challenging hand-held camera movement within desktop environments.
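
To illustrate what a constant position motion model implies for the filter's prediction step, the following is a minimal sketch, not the system's actual implementation. It assumes a position-only state, additive process noise, and a random-walk noise level `sigma`; the function name `predict_constant_position` and all parameter values are hypothetical. Under these assumptions the motion model is the identity, so the predicted mean is unchanged and only the covariance grows. For such a linear model the unscented transform gives the same mean and covariance, so no sigma points are needed in this simplified case.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def predict_constant_position(
    mean: Vector, cov: Matrix, sigma: float, dt: float
) -> Tuple[Vector, Matrix]:
    """Prediction step for a constant position (random walk) model.

    Assumptions (not taken from the paper): position-only state,
    isotropic additive noise, variance growing linearly with dt.
    """
    q = (sigma ** 2) * dt              # variance added to each axis
    new_mean = list(mean)              # identity motion: mean is unchanged
    new_cov = [row[:] for row in cov]  # copy so the input is not modified
    for i in range(len(new_mean)):
        new_cov[i][i] += q             # inflate uncertainty on the diagonal
    return new_mean, new_cov


# Hypothetical values: one prediction over a 1/30 s frame interval.
mean0 = [1.0, 2.0, 3.0]
cov0 = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]
mean1, cov1 = predict_constant_position(mean0, cov0, sigma=0.5, dt=1.0 / 30.0)
```

Because the model makes no velocity assumption, a sudden change in camera direction is not contradicted by the prediction; the trade-off is a larger search region for each measurement.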