Multiple instance discriminative dictionary learning for action recognition

Hongyang Li, Jun Chen, Zengmin Xu, Huafeng Chen, Ruimin Hu

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2016)

EI, CCF B

DOI: 10.1109/ICASSP.2016.7472030

Abstract

Action recognition from video is a prominent research area in computer vision, with far-reaching applications. Current state-of-the-art action recognition methods are based on the Fisher Vector (FV) coding model built on spatio-temporal local features. Although high-dimensional local features are more representative, their high dimensionality poses a challenge for dictionary learning in the FV model. This paper proposes a Multiple Instance Discriminative Dictionary Learning (MIDDL) method for action recognition. We introduce a cross-validation method into the multiple instance learning procedure, which prevents training from prematurely locking onto erroneous initial instances. To balance the number of positive instances across positive bags, only the top-ranked instances are labeled as positive during the iterative classifier training step. Taking these classifiers as discriminative visual words, we obtain a global video representation based on classifier responses. Experimental results demonstrate the effectiveness of using the learned discriminative classifiers as visual words on challenging action data sets, i.e., UCF50 and HMDB51.
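
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy nearest-centroid linear classifier in place of whatever classifier the paper trains, a two-fold split of the positive bags as the cross-validation step, and max pooling of classifier responses as the video representation. All function names and the parameters `k` and `rounds` are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train_linear(pos, neg):
    # Assumption: a nearest-centroid hyperplane stands in for the real classifier.
    mp, mn = mean(pos), mean(neg)
    w = [a - b for a, b in zip(mp, mn)]
    mid = [(a + b) / 2 for a, b in zip(mp, mn)]
    return w, -dot(w, mid)

def score(clf, x):
    w, b = clf
    return dot(w, x) + b

def top_k(clf, bag, k):
    # Keep only the k highest-scoring instances of a bag, so every
    # positive bag contributes the same number of positive instances.
    return sorted(bag, key=lambda x: score(clf, x), reverse=True)[:k]

def learn_word(pos_bags, neg_instances, k=2, rounds=4):
    # Assumption: cross-validation is modeled as two folds of positive bags.
    half = len(pos_bags) // 2
    folds = [pos_bags[:half], pos_bags[half:]]
    # Initial guess: every instance in fold 0 is treated as positive.
    pos = [x for bag in folds[0] for x in bag]
    clf = train_linear(pos, neg_instances)
    for r in range(rounds):
        # Relabel on the fold the current classifier was NOT trained on,
        # then retrain on the top-ranked instances of that fold.
        held_out = folds[(r + 1) % 2]
        pos = [x for bag in held_out for x in top_k(clf, bag, k)]
        clf = train_linear(pos, neg_instances)
    return clf

def encode_video(classifiers, features):
    # One dimension per learned "visual word": the strongest response
    # of that classifier over all local features of the video.
    return [max(score(c, x) for x in features) for c in classifiers]
```

In this sketch, a bag would be the set of local spatio-temporal features from one video, and calling `learn_word` once per desired visual word yields the list of classifiers passed to `encode_video`. Alternating between folds means the instances used for retraining are always chosen by a classifier that has not seen them, which is one plausible way to avoid locking onto a poor initialization.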