Paper ID | ARS-1.2
Paper Title | Enhancing Multi-step Action Prediction for Active Object Detection
Authors | Fen Fang, Qianli Xu, Nicolas Gauthier, Liyuan Li, Joo-Hwee Lim, Institute for Infocomm Research, Singapore
Session | ARS-1: Object Detection
Location | Area I
Session Time | Tuesday, 21 September, 15:30 - 17:00
Presentation Time | Tuesday, 21 September, 15:30 - 17:00
Presentation | Poster
Topic | Image and Video Analysis, Synthesis, and Retrieval: Image & Video Interpretation and Understanding
IEEE Xplore Open Preview | Available in IEEE Xplore
Abstract | View planning, i.e., predicting the next best views, is a key technical component of active object detection. It has been shown that adopting multi-step actions in a reinforcement learning setup can boost the efficiency and accuracy of active object detection. However, existing methods suffer from unstable detection outcomes when a naive strategy is used to combine the Q-values of multiple action-advantage branches, namely action range and action type. This is partially caused by the lack of independence between these branches. To tackle this issue, we propose a novel mechanism to disentangle action range from action type. We evaluate our method on two public datasets and compare it with competitive benchmarks. Our method yields a substantial gain in view-planning efficiency while enhancing detection accuracy.
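To make the abstract's setup concrete, the following is a minimal sketch (not the authors' code) of what combining Q-values from two advantage branches looks like. The action-space sizes, the dueling-style mean-centering, and the "pick each branch independently" step are all illustrative assumptions; the paper's actual disentangling mechanism is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed factorized action space: a discrete action *type*
# (e.g., move/zoom/stop) and a discrete action *range* (step size).
n_types, n_ranges = 4, 3

value = rng.normal()                    # state value V(s), scalar
adv_type = rng.normal(size=n_types)     # advantage branch over action types
adv_range = rng.normal(size=n_ranges)   # advantage branch over action ranges

# Naive combination: sum the mean-centered branch advantages for every
# (type, range) pair. The two branches are coupled inside one argmax,
# which the abstract identifies as a source of unstable outcomes.
q_naive = (value
           + (adv_type - adv_type.mean())[:, None]
           + (adv_range - adv_range.mean())[None, :])

# One way to decouple the decisions (an assumption, not the paper's exact
# scheme): select each factor from its own branch, so type and range do
# not trade off against each other inside a single joint argmax.
best_type = int(np.argmax(adv_type))
best_range = int(np.argmax(adv_range))

print(q_naive.shape)   # (4, 3): one Q-value per (type, range) pair
print(best_type, best_range)
```

By construction, the mean of `q_naive` over all action pairs equals `value`, the usual identifiability trick in dueling architectures.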