Paper ID | ARS-6.7
Paper Title | SPEAKER-INDEPENDENT LIPREADING BY DISENTANGLED REPRESENTATION LEARNING
Authors | Qun Zhang, Shilin Wang, Gongliang Chen, Shanghai Jiao Tong University, China
Session | ARS-6: Image and Video Interpretation and Understanding 1
Location | Area H
Session Time | Tuesday, 21 September, 15:30 - 17:00
Presentation Time | Tuesday, 21 September, 15:30 - 17:00
Presentation | Poster
Topic | Image and Video Analysis, Synthesis, and Retrieval: Image & Video Interpretation and Understanding
IEEE Xplore Open Preview | Available in IEEE Xplore
Abstract | With the development of deep learning technology, automatic lipreading based on deep neural networks can achieve reliable results for speakers that appear in the training dataset. However, speaker-independent lipreading, i.e., lipreading for unseen speakers, remains a challenging task, especially when training samples are limited. To improve recognition performance in the speaker-independent scenario, this paper proposes a new deep neural network structure, the Disentangled Visual Speech Recognition Network (DVSR-Net). DVSR-Net is designed to disentangle identity-related features and content-related features from the lip image sequence. To further eliminate identity information remaining in the content features, a content feature refinement stage is introduced during network optimization. In this way, the extracted features are closely tied to the content information and invariant to varying talking styles, so speech recognition performance for unseen speakers can be improved. Experiments on two widely used datasets demonstrate the effectiveness of the proposed network in the speaker-independent scenario.
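The abstract does not specify DVSR-Net's exact branch structure or refinement loss, so the sketch below is only a hedged illustration of the general disentanglement idea: split a per-frame embedding into a content part and an identity part, then score residual identity information in the content branch with a cosine-similarity penalty. All names (`split_features`, `identity_leakage`), the half/half split, and the penalty itself are assumptions for illustration, not the paper's actual design.

```python
import math

def split_features(frame_embedding, content_dim):
    """Split one frame embedding into (content, identity) parts.
    Illustrative only: DVSR-Net learns two encoder branches; this
    fixed slicing merely mimics that two-branch interface."""
    return frame_embedding[:content_dim], frame_embedding[content_dim:]

def identity_leakage(content_seq, identity_seq):
    """Mean |cosine similarity| between per-frame content and identity
    features -- a generic disentanglement penalty (an assumption, not
    the paper's exact refinement loss). Driving it toward 0 pushes
    residual identity information out of the content features."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)
    sims = [abs(cos(c, i)) for c, i in zip(content_seq, identity_seq)]
    return sum(sims) / len(sims)

# Toy lip-sequence embeddings: 3 frames, 8-dim each, split 4/4.
frames = [
    [0.9, 0.1, -0.3, 0.5, 0.2, 0.2, 0.1, -0.4],
    [0.7, -0.2, 0.4, 0.1, 0.3, -0.1, 0.2, 0.6],
    [-0.1, 0.8, 0.2, -0.5, 0.1, 0.4, -0.3, 0.2],
]
content, identity = zip(*(split_features(f, 4) for f in frames))
leakage = identity_leakage(content, identity)
print(round(leakage, 3))  # in [0, 1]; lower = less identity in content
```

In a trained model such a penalty would be one term of the optimization objective, alongside the speech recognition loss, so that the content features become speaker-invariant.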