This paper is concerned with allowing the user of a wearable, portable vision system to interact with the visual information using hand movements and gestures. Two example scenarios are explored. The first, in 2D, uses the wearer's hand both to guide an active wearable camera and to highlight objects of interest using a grasping vector. The second is based in 3D, and builds on earlier work which recovers 3D scene structure at video-rate, allowing real-time purposive redirection of the camera to any scene point. Here, a range of hand gestures is used to highlight and select 3D points within the structure and, in this instance, to insert 3D graphical objects into the scene. Structure recovery, gesture recognition, scene annotation and augmentation are achieved in parallel and at video-rate.