Paper ID | IMT-CIF-1.3
Paper Title | Understanding VQA for Negative Answers through Visual and Linguistic Inference
Authors | Seungjun Jung, Junyoung Byun, Kyujin Shim, Korea Advanced Institute of Science and Technology, Republic of Korea; Sanghyun Hwang, Agency for Defense Development, Republic of Korea; Changick Kim, Korea Advanced Institute of Science and Technology, Republic of Korea
Session | IMT-CIF-1: Computational Imaging 1
Location | Area J
Session Time | Monday, 20 September, 13:30 - 15:00
Presentation Time | Monday, 20 September, 13:30 - 15:00
Presentation | Poster
Topic | Computational Imaging Methods and Models: Learning-Based Models
Abstract | In order to make Visual Question Answering (VQA) explainable, previous studies not only visualize the attended regions of a VQA model but also generate textual explanations for its answers. However, when the model's answer is ``no," existing methods have difficulty revealing the detailed arguments that lead to that answer. In addition, previous methods fail to provide logical bases when the question requires common sense to answer. In this paper, we propose a novel textual explanation method to overcome these limitations. First, we extract keywords that are essential for inferring an answer from a question. Second, we utilize a novel Variable-Constrained Beam Search (VCBS) algorithm to generate explanations that best describe the circumstances in images. Furthermore, if the answer to the question is ``yes" or ``no," we apply Natural Language Inference (NLI) to determine whether the contents of the question can be inferred from the explanation using common sense. Our user study, conducted on Amazon Mechanical Turk (MTurk), shows that our proposed method generates more reliable explanations than previous methods. Moreover, by modifying the VQA model's answer through the output of the NLI model, we show that VQA performance increases by 1.1% over the original model.
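The abstract's final step, using an NLI model to validate or flip a yes/no VQA answer, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `nli_entailment` function is a hypothetical stand-in for a real NLI model (here a toy word-overlap heuristic), and the hypothesis construction and threshold are assumptions.

```python
def nli_entailment(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for a trained NLI model.

    A real system would score entailment with a model fine-tuned on an
    NLI corpus; this toy version just measures word overlap so the
    decision logic below is runnable for illustration.
    """
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().split())
    return len(premise_words & hypothesis_words) / max(len(hypothesis_words), 1)


def refine_answer(vqa_answer: str, question: str, explanation: str,
                  threshold: float = 0.5) -> str:
    """Flip a yes/no VQA answer when the generated explanation disagrees.

    Follows the idea in the abstract: if the explanation entails the
    content of the question, the answer should be "yes"; otherwise "no".
    Non-yes/no answers pass through unchanged.
    """
    if vqa_answer not in ("yes", "no"):
        return vqa_answer  # NLI check only applies to yes/no questions
    # Naively turn the question into a declarative hypothesis
    # (an assumption; the paper's actual conversion may differ).
    hypothesis = question.rstrip("?").replace("Is ", "", 1)
    entailed = nli_entailment(explanation, hypothesis) >= threshold
    return "yes" if entailed else "no"


# The explanation supports the question's content, so "no" is flipped.
print(refine_answer("no", "Is the man holding a frisbee?",
                    "the man holding a frisbee in the park"))
```

A non-binary answer (e.g., a color or a count) bypasses the NLI check entirely, matching the abstract's restriction of this step to yes/no questions.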