Paper Detail

Paper ID: MLR-APPL-IVSMR-1.5
Paper Title: MULTI-TASK DISTILLATION: TOWARDS MITIGATING THE NEGATIVE TRANSFER IN MULTI-TASK LEARNING
Authors: Ze Meng, Xin Yao, Lifeng Sun, Tsinghua University, China
Session: MLR-APPL-IVSMR-1: Machine learning for image and video sensing, modeling and representation 1
Location: Area C
Session Time: Tuesday, 21 September, 13:30 - 15:00
Presentation Time: Tuesday, 21 September, 13:30 - 15:00
Presentation: Poster
Topic: Applications of Machine Learning: Machine learning for image & video sensing, modeling, and representation
IEEE Xplore Open Preview: View in IEEE Xplore
Abstract: In this paper, we propose a top-down mechanism for alleviating negative transfer in multi-task learning (MTL). MTL aims to learn general meta-knowledge by sharing inductive biases among tasks in order to improve generalization. However, MTL suffers from a negative transfer problem: improving the performance of one task degrades the performance of other tasks due to task competition. As a multi-objective optimization problem, MTL usually involves a trade-off between the individual performances of different tasks. Inspired by knowledge distillation, which transfers knowledge from a teacher model to a student model without significant performance loss, we propose multi-task distillation to cope with negative transfer, turning the multi-objective problem into a multi-teacher knowledge distillation problem. Specifically, we first collect task-specific Pareto-optimal teacher models, and then use multi-teacher knowledge distillation to achieve high individual performance on each task in the student model without such a trade-off. Moreover, we propose a multi-task warm-up initialization and a teacher experience pool to accelerate our method. Extensive experimental results on various benchmark datasets demonstrate that our method outperforms state-of-the-art multi-task learning algorithms and the single-task training baseline.
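
To make the multi-teacher distillation idea concrete, the minimal sketch below shows how frozen, task-specific teachers could distill into a single shared student. It is not the authors' exact formulation: it assumes PyTorch, a hypothetical student interface with per-task heads, and a standard temperature-scaled soft-target KL loss, and it omits the paper's Pareto-optimal teacher selection, multi-task warm-up initialization, and teacher experience pool.

# Minimal multi-teacher distillation sketch (PyTorch assumed). Module names,
# loss weights, and the KL-based soft-target loss are illustrative assumptions,
# not the paper's exact method.
import torch
import torch.nn.functional as F

def multi_teacher_distillation_loss(student, teachers, x, targets,
                                    task_losses, alpha=0.5, T=2.0):
    """Combine per-task supervised losses with distillation from frozen,
    task-specific teacher models into one shared student."""
    total = 0.0
    for task, teacher in teachers.items():
        # Hypothetical interface: shared trunk plus a head selected by `task`.
        student_logits = student(x, task=task)
        with torch.no_grad():  # teachers stay frozen during distillation
            teacher_logits = teacher(x)
        # Supervised loss for this task (e.g. cross-entropy).
        sup = task_losses[task](student_logits, targets[task])
        # Soft-target distillation loss (temperature-scaled KL divergence).
        kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                      F.softmax(teacher_logits / T, dim=-1),
                      reduction="batchmean") * (T * T)
        total = total + (1 - alpha) * sup + alpha * kd
    return total

In this sketch, each teacher supplies soft targets only for its own task, so the student is pulled toward every teacher's behavior at once rather than trading tasks off against each other during joint training.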