Paper ID | MLR-APPL-IVSMR-1.4
Paper Title | VVS: ACTION RECOGNITION WITH VIRTUAL VIEW SYNTHESIS
Authors | Gao Peng, Yong-Lu Li, Hao Zhu, Jiajun Tang, Jin Xia, Cewu Lu, Shanghai Jiao Tong University, China
Session | MLR-APPL-IVSMR-1: Machine learning for image and video sensing, modeling and representation 1
Location | Area C
Session Time | Tuesday, 21 September, 13:30 - 15:00
Presentation Time | Tuesday, 21 September, 13:30 - 15:00
Presentation | Poster
Topic | Applications of Machine Learning: Machine learning for image & video sensing, modeling, and representation
Abstract | Action recognition research usually operates in a single-view setting, but human actions are often not single-view in nature. Many simple actions combine body movements seen from the third-person view with visual guidance from the first-person view. Linking data from these two viewpoints is therefore important for action recognition algorithms. Currently, aligned multi-view datasets are small in scale, which limits progress in this direction of research. To alleviate this data limitation, we present the novel Virtual View Synthesis (VVS) module. Instead of training and testing on small-scale multi-view data, VVS is first pre-trained on multi-view data to learn a generalizable multi-view "supervisory attention". It is then incorporated into a single-view action recognition model, transferring the ability to better observe the existing view based on experience from the other view. Extensive experiments demonstrate that VVS improves strong baselines on several single-view action recognition benchmarks.
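To make the two-stage recipe in the abstract concrete, below is a minimal PyTorch sketch of how a pre-trained attention module could be inserted into a single-view recognition model. All class and layer names (`VirtualViewSynthesis`, `attn_head`, `SingleViewRecognizer`) are hypothetical illustrations under assumed tensor shapes; the paper's actual architecture is not reproduced here.

```python
# A minimal, assumption-laden sketch of the VVS idea: a small module that
# predicts a "supervisory attention" map over single-view features, as if
# guided by a second viewpoint, then re-weights the features with it.
import torch
import torch.nn as nn


class VirtualViewSynthesis(nn.Module):
    """Hypothetical VVS module (not the paper's exact design): pre-trained on
    aligned multi-view data, then reused on single-view features."""

    def __init__(self, channels: int):
        super().__init__()
        # Small conv head mapping features to a per-location attention map.
        self.attn_head = nn.Sequential(
            nn.Conv2d(channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) single-view features.
        attn = self.attn_head(feats)   # (B, 1, H, W) attention map
        return feats * attn + feats    # residual re-weighting


class SingleViewRecognizer(nn.Module):
    """Single-view action recognition model with the VVS module inserted
    between the backbone and the classifier."""

    def __init__(self, backbone: nn.Module, channels: int, num_classes: int):
        super().__init__()
        self.backbone = backbone       # e.g. a CNN trunk producing (B, C, H, W)
        self.vvs = VirtualViewSynthesis(channels)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)       # (B, C, H, W)
        feats = self.vvs(feats)        # apply learned multi-view attention
        return self.fc(self.pool(feats).flatten(1))
```

In the spirit of the abstract, the `VirtualViewSynthesis` weights would first be trained on aligned multi-view pairs (e.g. supervising the attention map with cues from the first-person view), then frozen or fine-tuned inside `SingleViewRecognizer` on a standard single-view benchmark.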