Paper ID | TEC-6.5
Paper Title | REFERENCE-BASED VIDEO COLORIZATION WITH MULTI-SCALE SEMANTIC FUSION AND TEMPORAL AUGMENTATION
Authors | Yaxin Liu, Xiaoyan Zhang, Shenzhen University, China; Xiaogang Xu, The Chinese University of Hong Kong, China
Session | TEC-6: Image and Video Processing 2
Location | Area G
Session Time | Monday, 20 September, 15:30 - 17:00
Presentation Time | Monday, 20 September, 15:30 - 17:00
Presentation | Poster
Topic | Image and Video Processing: Multiresolution processing of images & video
Abstract | Reference-based video colorization hallucinates a plausible color version of a gray-scale video by transferring distributions of possible colors from an input color frame that has semantic correspondences with the gray-scale frames. The plausibility of the colors and the temporal consistency across frames are the two main challenges in this task. To tackle them, in this paper we propose a novel Generative Adversarial Network (GAN) with a Siamese training framework. Specifically, the Siamese training framework allows us to implement temporal feature augmentation, which enhances temporal consistency. Further, to improve the plausibility of the colorization results, we propose a multi-scale fusion module that accurately correlates features of the reference frame with those of the source frames; illustrative sketches of both components follow. Experiments on various datasets show that our method achieves colorization with higher semantic accuracy than existing state-of-the-art approaches while maintaining temporal consistency among neighboring frames.
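The abstract describes the multi-scale fusion module only at a high level. Below is a minimal PyTorch-style sketch of one plausible reading, treating "correlating reference features to source features" as softmax attention computed at several spatial scales and merged by a 1x1 convolution. The class name, the projection layers, the scale set, and the concatenate-and-merge step are illustrative assumptions, not the authors' implementation.

```python
# Sketch of reference-to-source feature correlation at multiple scales.
# All shapes, names, and the fusion-by-concatenation choice are
# assumptions for illustration; the paper's actual module may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleFusion(nn.Module):
    """Correlates reference features with source features at several
    spatial scales via softmax attention, then merges the warped maps."""

    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        # 1x1 projections for queries (source) and keys (reference).
        self.query = nn.Conv2d(channels, channels, kernel_size=1)
        self.key = nn.Conv2d(channels, channels, kernel_size=1)
        self.merge = nn.Conv2d(channels * len(scales), channels, kernel_size=1)

    def correlate(self, src: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
        # src, ref: (B, C, H, W). Flatten spatial dims for attention.
        b, c, h, w = src.shape
        q = self.query(src).flatten(2)   # (B, C, HW)
        k = self.key(ref).flatten(2)     # (B, C, HW)
        attn = torch.softmax(q.transpose(1, 2) @ k / c ** 0.5, dim=-1)
        # Warp reference features toward the source layout: (B, HW, C).
        warped = attn @ ref.flatten(2).transpose(1, 2)
        return warped.transpose(1, 2).reshape(b, c, h, w)

    def forward(self, src: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
        outs = []
        for s in self.scales:
            src_s = F.avg_pool2d(src, s) if s > 1 else src
            ref_s = F.avg_pool2d(ref, s) if s > 1 else ref
            warped = self.correlate(src_s, ref_s)
            if s > 1:  # upsample back to the finest resolution
                warped = F.interpolate(warped, size=src.shape[-2:],
                                       mode='bilinear', align_corners=False)
            outs.append(warped)
        return self.merge(torch.cat(outs, dim=1))
```

Note that the dense attention above is quadratic in the number of spatial positions, so a practical variant would likely restrict full correlation to the coarser scales or to local windows.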
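The Siamese temporal feature augmentation is likewise not detailed in the abstract. One common reading is that the colorization network is applied twice with shared weights, once to a frame and once to a temporally perturbed copy, with a consistency term tying the two predictions together. The shift-based perturbation, the net(gray, ref) calling convention, and the L1 penalty in this sketch are all assumptions for illustration.

```python
# Sketch of a Siamese consistency pass: the same network (shared weights)
# colorizes a frame and a temporally perturbed copy, and an L1 term
# penalizes disagreement. The perturbation and the loss are assumptions,
# not the paper's exact training scheme.
import torch
import torch.nn as nn


def siamese_consistency_loss(net: nn.Module,
                             gray: torch.Tensor,
                             ref: torch.Tensor,
                             max_shift: int = 4) -> torch.Tensor:
    """gray: (B, 1, H, W) luminance input; ref: (B, C, H, W) reference
    features. The net(gray, ref) signature is a hypothetical interface."""
    # Simulate small inter-frame motion with a random spatial shift.
    dx = int(torch.randint(-max_shift, max_shift + 1, (1,)))
    dy = int(torch.randint(-max_shift, max_shift + 1, (1,)))
    gray_aug = torch.roll(gray, shifts=(dy, dx), dims=(-2, -1))

    out = net(gray, ref)          # colorization of the original frame
    out_aug = net(gray_aug, ref)  # same weights on the perturbed frame

    # Undo the shift so predictions are spatially aligned before comparing.
    out_aug = torch.roll(out_aug, shifts=(-dy, -dx), dims=(-2, -1))
    return nn.functional.l1_loss(out, out_aug)
```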