Paper ID | MLR-APPL-IVSMR-1.9
Paper Title | GET TO THE POINT: CONTENT CLASSIFICATION OF ANIMATED GRAPHICS INTERCHANGE FORMATS WITH KEY-FRAME ATTENTION
Authors | Yongjuan Ma, Yu Wang, Pengfei Zhu, Junwen Pan, Hong Shi, Tianjin University, China
Session | MLR-APPL-IVSMR-1: Machine learning for image and video sensing, modeling and representation 1
Location | Area C
Session Time | Tuesday, 21 September, 13:30 - 15:00
Presentation Time | Tuesday, 21 September, 13:30 - 15:00
Presentation | Poster
Topic | Applications of Machine Learning: Machine learning for image & video sensing, modeling, and representation
Abstract | Animated Graphics Interchange Formats (GIFs) are low-bandwidth short image sequences that continuously display multiple frames without sound. In this paper, we focus on a new content classification task that is important in real-world applications. A key problem for this task is that some frames in an animated GIF are irrelevant to the label, which can drastically reduce classification performance. To this end, we first collect a new dataset of Web animated GIFs (WGIF) that includes typical samples in which only a few key-frames are relevant to the ground-truth label. Then, an attention-based method is designed to produce importance scores for the frames, and the multi-frame predictions are merged to obtain the final prediction. In addition, an entropy loss is used to sharpen the attention, further emphasizing the key-frames. Experimental results on WGIF show that the proposed approach significantly outperforms various baseline methods.
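The abstract outlines three ingredients: per-frame importance scores from an attention head, a score-weighted merge of per-frame predictions, and an entropy penalty that sharpens the attention toward key-frames. The PyTorch sketch below illustrates that combination under stated assumptions; the module names, feature dimension, and the loss weight `lambda_ent` are illustrative choices, not the authors' implementation.

```python
# Minimal sketch of key-frame attention for GIF classification, assuming
# per-frame features (e.g. from a CNN backbone) of shape (batch, frames, dim).
import torch
import torch.nn as nn
import torch.nn.functional as F


class KeyFrameAttentionClassifier(nn.Module):
    def __init__(self, feat_dim: int = 512, num_classes: int = 10):
        super().__init__()
        # Attention head: one importance logit per frame.
        self.attn = nn.Linear(feat_dim, 1)
        # Per-frame classifier shared across frames.
        self.cls = nn.Linear(feat_dim, num_classes)

    def forward(self, frame_feats: torch.Tensor):
        scores = self.attn(frame_feats).squeeze(-1)        # (B, T) frame logits
        weights = F.softmax(scores, dim=1)                 # (B, T) importance
        frame_logits = self.cls(frame_feats)               # (B, T, C)
        # Merge multi-frame predictions with the learned importance weights.
        logits = (weights.unsqueeze(-1) * frame_logits).sum(dim=1)  # (B, C)
        return logits, weights


def total_loss(logits, weights, labels, lambda_ent: float = 0.1):
    # Classification loss plus an entropy penalty on the attention weights;
    # minimizing the entropy concentrates mass on a few key-frames.
    ce = F.cross_entropy(logits, labels)
    ent = -(weights * torch.log(weights + 1e-8)).sum(dim=1).mean()
    return ce + lambda_ent * ent


if __name__ == "__main__":
    model = KeyFrameAttentionClassifier()
    feats = torch.randn(4, 16, 512)        # 4 GIFs, 16 frames each
    labels = torch.randint(0, 10, (4,))
    logits, weights = model(feats)
    loss = total_loss(logits, weights, labels)
    print(loss.item(), weights.shape)
```

Lowering the attention entropy is one simple way to realize the "emphasize the key-frames" objective; the paper's actual loss formulation and weighting may differ.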