Paper ID | ARS-7.7 | ||
Paper Title | Multi-Task Learning by a Top-Down Control Network | ||
Authors | Hila Levi, Shimon Ullman, Weizmann Institute of Science, Israel | ||
Session | ARS-7: Image and Video Interpretation and Understanding 2 | ||
Location | Area H | ||
Session Time | Wednesday, 22 September, 08:00 - 09:30 | ||
Presentation Time | Wednesday, 22 September, 08:00 - 09:30 | ||
Presentation | Poster | ||
Topic | Image and Video Analysis, Synthesis, and Retrieval: Image & Video Interpretation and Understanding | ||
Abstract | As the range of tasks performed by a general vision system expands, executing multiple tasks accurately and efficiently in a single network has become an important and still open problem. Recent computer vision approaches address this problem with branching networks, or with channel-wise modulation of the network's feature maps by task-specific vectors. We present a novel architecture that uses a dedicated top-down control network to modify the activation of all the units in the main recognition network in a manner that depends on the selected task, image content, and spatial location. We show the effectiveness of our scheme by achieving significantly better results than alternative state-of-the-art approaches on four datasets. We further demonstrate advantages in task selectivity, in scaling with the number of tasks, and in interpretability. |
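The contrast the abstract draws between channel-wise modulation and per-unit modulation can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy feature map, and the hand-written gate values are all hypothetical, and the gates merely stand in for whatever a control network would produce from the task and the image.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def channel_wise_modulate(features: FeatureMap, task_vector: List[float]) -> FeatureMap:
    """Scale every unit in channel c by the single task-specific scalar task_vector[c]."""
    if len(task_vector) != len(features):
        raise ValueError("task_vector needs one entry per channel")
    return [
        [[value * scale for value in row] for row in channel]
        for channel, scale in zip(features, task_vector)
    ]


def unit_wise_modulate(features: FeatureMap, gates: FeatureMap) -> FeatureMap:
    """Scale every unit by its own gate, so modulation can vary with spatial location."""
    return [
        [[v * g for v, g in zip(f_row, g_row)] for f_row, g_row in zip(f_ch, g_ch)]
        for f_ch, g_ch in zip(features, gates)
    ]


# Toy input (hypothetical): 2 channels on a 2x2 spatial grid.
features = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]

# Hypothetical task vector: keep channel 0, suppress channel 1 everywhere.
task_vector = [1.0, 0.0]

# Hypothetical gates standing in for a control network's output:
# keep only the top-left location, in both channels.
gates = [[[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]]]

channel_out = channel_wise_modulate(features, task_vector)
unit_out = unit_wise_modulate(features, gates)
```

In this toy setting, channel-wise modulation can only switch whole channels on or off, whereas per-unit gates can select a single location across channels, which is the extra degree of freedom the abstract attributes to task-, content-, and location-dependent control.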