Paper ID | MLR-APPL-BSIP.12 | ||
Paper Title | TURNIP: TIME-SERIES U-NET WITH RECURRENCE FOR NIR IMAGING PPG | ||
Authors | Armand Comas-Massagué, Northeastern University, United States; Tim Marks, Hassan Mansour, Suhas Lohit, Mitsubishi Electric Research Laboratories (MERL), United States; Yechi Ma, Princeton University, United States; Xiaoming Liu, Michigan State University, United States | ||
Session | MLR-APPL-BSIP: Machine learning for biomedical signal and image processing | ||
Location | Area C | ||
Session Time | Wednesday, 22 September, 08:00 - 09:30 | ||
Presentation Time | Wednesday, 22 September, 08:00 - 09:30 | ||
Presentation | Poster | ||
Topic | Applications of Machine Learning: Machine learning for biomedical signal and image processing | ||
Abstract | Imaging photoplethysmography (iPPG) is the process of estimating the waveform of a person's pulse by processing a video of their face to detect minute color or intensity changes in the skin. Typically, iPPG methods use three-channel RGB video to address challenges due to motion. In situations such as driving, however, illumination in the visible spectrum is often rapidly varying (e.g., daytime driving through shadows of trees and buildings) or insufficient (e.g., night driving). In such cases, a practical alternative is to use active illumination and bandpass-filtering from a monochromatic near-infrared (NIR) light source and camera. Unlike learning-based iPPG solutions designed for multi-channel RGB, previous work in single-channel NIR iPPG has been based on hand-crafted models (with only a few manually tuned parameters) that exploit the sparsity of the PPG signal in the frequency domain. In contrast, we propose a modular framework for iPPG estimation of the heartbeat signal, in which the first module extracts a time-series signal from monochromatic NIR face video. The second module consists of a novel time-series U-net architecture in which a GRU (gated recurrent unit) network has been added to the passthrough layers. We test our approach on the challenging MR-NIRP Car Dataset, which consists of monochromatic NIR videos taken in both stationary and driving conditions. Our model's iPPG estimation performance on NIR video outperforms both the state-of-the-art model-based method and a recent end-to-end deep learning method that we adapted to monochromatic video. |
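The abstract notes that earlier NIR iPPG methods exploit the sparsity of the PPG signal in the frequency domain. The sketch below illustrates only that general idea, not the method of this paper or of any prior work: for a roughly periodic pulse signal, most spectral energy sits near the heart-rate frequency, so the strongest DFT bin inside a plausible band gives a rough heart-rate estimate. The function name `dominant_frequency_bpm`, the 40 to 240 bpm band, and the synthetic test signal are all assumptions chosen for illustration.

```python
import math


def dominant_frequency_bpm(signal, fps, lo_bpm=40.0, hi_bpm=240.0):
    """Return the frequency (in beats per minute) of the strongest DFT bin
    inside [lo_bpm, hi_bpm]. The band limits are illustrative assumptions."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC component

    best_bpm, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        bpm = 60.0 * k * fps / n  # frequency of bin k, converted to bpm
        if bpm < lo_bpm or bpm > hi_bpm:
            continue
        # Real and imaginary parts of the k-th DFT coefficient.
        re = sum(x * math.cos(2 * math.pi * k * t / n)
                 for t, x in enumerate(centered))
        im = -sum(x * math.sin(2 * math.pi * k * t / n)
                  for t, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic stand-in for an extracted time series (hypothetical data):
# 10 s at 30 frames per second, a 1.2 Hz pulse plus a slow 0.1 Hz drift.
fps = 30.0
signal = [
    math.sin(2 * math.pi * 1.2 * t / fps)
    + 0.5 * math.sin(2 * math.pi * 0.1 * t / fps)
    for t in range(300)
]
estimate = dominant_frequency_bpm(signal, fps)
```

On this clean synthetic input the slow drift falls outside the search band, so the estimate lands on the pulse frequency. Real NIR face video is far noisier, which is the setting the learned time-series model in the abstract is meant to address.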