Login Paper Search My Schedule Paper Index Help

My ICIP 2021 Schedule

Note: Your custom schedule will not be saved unless you create a new account or login to an existing account.
  1. Create a login based on your email (takes less than one minute)
  2. Perform 'Paper Search'
  3. Select papers that you desire to save in your personalized schedule
  4. Click on 'My Schedule' to see the current list of selected papers
  5. Click on 'Printable Version' to create a separate window suitable for printing (the header and menu will appear, but will not actually print)

Paper Detail

Paper IDSS-MIA.8
Paper Title DEEP FEATURES FUSION WITH MUTUAL ATTENTION TRANSFORMER FOR SKIN LESION DIAGNOSIS
Authors Li Zhou, Yan Luo, University of Massachusetts Lowell, United States
SessionSS-MIA: Special Session: Deep Learning and Precision Quantitative Imaging for Medical Image Analysis
LocationArea A
Session Time:Wednesday, 22 September, 14:30 - 16:00
Presentation Time:Wednesday, 22 September, 14:30 - 16:00
Presentation Poster
Topic Special Sessions: Deep Learning and Precision Quantitative Imaging for Medical Image Analysis
IEEE Xplore Open Preview  Click here to view in IEEE Xplore
Abstract Early skin lesion diagnosis is crucial to prevent skin cancer, and deep learning (DL) based methods are well exploited to support dermatologists' diagnosis. The data for the diagnosis tasks include dermoscopic lesion images and textual information. It is a challenge to learn features from the multimodal data to improve diagnostic quality. Inspired by the vision and language integration models in Visual Question Answer (VQA), we present an end-to-end neural network model for skin lesion diagnosis using both images and textual information simultaneously. Specifically, we fine-grained features from the two modalities (image and text) of the dataset by the pre-trained DL models. We propose a novel approach named Mutual Attention Transformer (MAT), which consists of self-attention blocks and guided-attention blocks, to enable the interactions between the features from both modalities concurrently. We then develop a fusion mechanism to integrate the represented features before the final classification output layer. The experimental results on the HAM10000 dataset demonstrate that the proposed method outperforms the state-of-art methods for skin lesion diagnosis.