23 May 2016 Weighing classes and streams: toward better methods for two-stream convolutional networks
Hoseong Kim, Youngjung Uh, Seunghyeon Ko, Hyeran Byun
Abstract
Two-stream convolutional networks have improved action recognition by extracting appearance and motion features from videos concurrently. However, most existing approaches combine the two streams by simply averaging their prediction scores, ignoring that some classes benefit from weighting appearance more heavily than motion. We propose a fusion method for two-stream convolutional networks that introduces objective functions over the fusion weights under two assumptions: (1) the scores from the two streams should not be weighted equally and (2) the weights vary across classes. We evaluate the method through extensive action recognition experiments on the UCF101, HMDB51, and Hollywood2 datasets. The proposed approach outperforms standard two-stream convolutional networks by 5.7%, 4.8%, and 3.6% on UCF101, HMDB51, and Hollywood2, respectively.
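To make the contrast between score averaging and class-dependent weighting concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' method: the abstract does not state the objective functions used to learn the weights, so the weights below are hypothetical hand-picked values, and the convention that the motion weight equals one minus the appearance weight is an assumption made only for illustration. The function names and score vectors are likewise invented for the example.

```python
import numpy as np


def average_fusion(appearance, motion):
    """Baseline fusion: both streams get equal weight for every class."""
    return 0.5 * (appearance + motion)


def classwise_weighted_fusion(appearance, motion, w_appearance):
    """Fuse with one appearance weight per class.

    Assumption for this sketch: the motion weight for each class is
    1 - w_appearance, so each class mixes the two streams convexly.
    """
    w = np.asarray(w_appearance, dtype=float)
    return w * appearance + (1.0 - w) * motion


# Hypothetical per-class scores from each stream for one video (3 classes).
appearance = np.array([0.5, 0.3, 0.2])
motion = np.array([0.2, 0.2, 0.6])

# Hypothetical weights: class 0 is assumed to rely mostly on appearance.
w_appearance = np.array([0.9, 0.5, 0.5])

avg = average_fusion(appearance, motion)
fused = classwise_weighted_fusion(appearance, motion, w_appearance)

print("average fusion predicts class", int(np.argmax(avg)))
print("class-wise weighted fusion predicts class", int(np.argmax(fused)))
```

With these made-up numbers the two rules pick different classes, which illustrates why a single shared weight can be a poor fit when classes differ in how much they depend on appearance versus motion.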
© 2016 Society of Photo-Optical Instrumentation Engineers (SPIE) 0091-3286/2016/$25.00
Hoseong Kim, Youngjung Uh, Seunghyeon Ko, and Hyeran Byun "Weighing classes and streams: toward better methods for two-stream convolutional networks," Optical Engineering 55(5), 053108 (23 May 2016). https://doi.org/10.1117/1.OE.55.5.053108
Published: 23 May 2016
KEYWORDS
Video
Optical engineering
Video surveillance
Convolutional neural networks
Optical flow
RGB color model
Computer vision technology