We present a system that aims to recognize activities from an egocentric perspective, where the prime source of information is the gradient regions around the wearer’s gaze fixations. Inspired by evidence from vision research on the gaze patterns of people performing manual tasks, we assess how well an existing real-time method for region description performs on a dataset of about 200 video sequences recorded with a wearable gaze tracker. We evaluate the traditional bag-of-words classification approach, and we also introduce and evaluate a weighted multiple voting scheme. We model an activity as a record of the visual landmarks fixated as the person progresses through its steps. Our method shows encouraging results on 11 different classes of manual and household activities, with the multiple voting scheme nearly doubling the hit rate.