Improving Classification by Improving Labelling: Introducing Probabilistic Multi-Label Object Interaction Recognition

March 2017

Michael Wray, Davide Moltisanti, Walterio Mayol-Cuevas, Dima Damen

Overview

For the task of action recognition semantic ambiguities between verbs can result in overlaps between classes. Because of this standard classification techniques are unable to correctly learn valid verb labels for each video.

Given a video segment containing an object interaction we model the probability of a verb - out of a list of verbs - being chosen by human annotators as a correct label for the video.

We use a two-stream CNN and test a probabilistic classifier on two public datasets, comprising of 1405 video sequences which we label, and outperform conventional single-label classification by 11% and 6% on the two datasets respectively. We also show that by learning probabilities the method is able outperform majority voting and enables discovery of co-occurring labels.

Publication:

Improving Classification by Improving Labelling: Introducing Probabilistic Multi-Label Object Interaction Recognition, Michael Wray, Davide Moltisanti, Walterio Mayol-Cuevas and Dima Damen (2017). arxiv

Video:

Results

Example results when looking at the top 5 ground truth and predicted verbs:

Results

Dataset:

Bristol Egocentric Object Interactions Dataset