Paper ID | ARS-6.9
Paper Title | LIGHTER AND FASTER CROSS-CONCATENATED MULTI-SCALE RESIDUAL BLOCK BASED NETWORK FOR VISUAL SALIENCY PREDICTION
Authors | Sai Phani Kumar Malladi, Jayanta Mukhopadhyay, Indian Institute of Technology Kharagpur, India; Mohamed-Chaker Larabi, University of Poitiers, France; Santanu Chaudhury, Indian Institute of Technology Jodhpur, India
Session | ARS-6: Image and Video Interpretation and Understanding 1
Location | Area H
Session Time | Tuesday, 21 September, 15:30 - 17:00
Presentation Time | Tuesday, 21 September, 15:30 - 17:00
Presentation | Poster
Topic | Image and Video Analysis, Synthesis, and Retrieval: Image & Video Interpretation and Understanding
IEEE Xplore Open Preview | Available in IEEE Xplore
Abstract | Existing deep architectures for visual saliency prediction suffer from inefficient feature encoding, long inference times, and very large parameter counts. One possible solution is a novel, lighter architecture that makes local and global contextual feature extraction computationally less intensive. In this work, we propose an end-to-end learnable, residual block-based architecture with inter-scale information sharing for saliency prediction. A series of these blocks performs efficient multi-scale feature extraction, followed by a dilated inception module (DIM) and a novel decoder. We name this network the cross-concatenated multi-scale residual (CMR) block-based network, CMRNet. We comprehensively evaluate our architecture on three datasets: SALICON, MIT1003, and MIT300. Experimental results show that our model performs on par with other state-of-the-art models; notably, it achieves the lowest inference time and the smallest number of parameters among them.
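The abstract's central building block, a residual block that shares information across scales before the skip connection, can be sketched as follows. This is an illustrative sketch only, not the authors' code: the branch count, dilation rates, and the 1x1 fusion layer are assumptions chosen to mirror the "cross-concatenated multi-scale" idea described above.

```python
# Hypothetical sketch of a cross-concatenated multi-scale residual (CMR)
# block: parallel dilated 3x3 branches capture features at several scales,
# their outputs are concatenated across scales, fused back to the input
# width with a 1x1 convolution, and added to the input (residual path).
# All design details here are assumptions, not the paper's exact block.
import torch
import torch.nn as nn


class CMRBlock(nn.Module):
    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        # One 3x3 branch per scale; dilation widens the receptive field
        # without adding parameters, keeping the block lightweight.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        # 1x1 convolution fuses the cross-concatenated multi-scale features
        # back to the original channel width.
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [self.act(branch(x)) for branch in self.branches]
        out = self.fuse(torch.cat(feats, dim=1))  # concat across scales
        return self.act(out + x)                  # residual connection


if __name__ == "__main__":
    x = torch.randn(1, 32, 64, 64)
    y = CMRBlock(32)(x)
    print(tuple(y.shape))  # spatial size and channels are preserved
```

Because each dilated branch pads to keep the spatial size and the 1x1 fusion restores the channel count, blocks like this can be stacked in series, matching the abstract's "series of these blocks" for multi-scale feature extraction.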