Aimed at contextual mapping of environments through exploration, this paper proposes a method that recognizes human activity observed from a moving camera and references this information to a previously mapped environment. We first introduce a novel method that combines sparse features and dense optical flow to perform dense background subtraction for an agile camera. With the ego-motion disambiguated, we then present a method capable of recognizing both external and egocentric human activity. When combined with visual simultaneous localization and mapping (SLAM), this enables the augmentation of visual maps with activity tags, highlighting areas of interest within large environments. Associating activity with location adds the contextual element of purpose to each area of interest.