My ICIP 2021 Schedule

Paper Detail

Paper ID ARS-9.10
Paper Title TIME-LAG AWARE MULTI-MODAL VARIATIONAL AUTOENCODER USING BASEBALL VIDEOS AND TWEETS FOR PREDICTION OF IMPORTANT SCENES
Authors Kaito Hirasawa, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama, Hokkaido University, Japan
Session ARS-9: Interpretation, Understanding, Retrieval
Location Area I
Session Time: Tuesday, 21 September, 13:30 - 15:00
Presentation Time: Tuesday, 21 September, 13:30 - 15:00
Presentation Poster
Topic Image and Video Analysis, Synthesis, and Retrieval: Image & Video Storage and Retrieval
Abstract This paper presents a novel method for predicting important scenes from baseball videos and tweets posted on Twitter, based on a time-lag aware multi-modal variational autoencoder (Tl-MVAE-PIS). The paper makes two technical contributions. First, to use heterogeneous data effectively for important scene prediction, we transform the textual, visual, and audio features obtained from tweets and videos into latent features, so that Tl-MVAE-PIS can flexibly express the relationships among them in the constructed latent space. Second, since there are time lags between tweets and the multiple previous events they correspond to, Tl-MVAE-PIS takes these time lags into account when estimating their relationships and deriving their latent features. Together, these two contributions enable accurate prediction of important scenes. Experiments using actual baseball videos and their corresponding tweets show the effectiveness of Tl-MVAE-PIS.
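
The abstract does not say how the time lags between a tweet and its multiple previous events are modeled. Purely as an illustration of the idea, and not as the authors' method, the sketch below weights each preceding event by an exponentially decaying function of its lag and averages the event features accordingly. The function names, the exponential decay, and the parameters `max_lag` and `tau` are all assumptions introduced here.

```python
import math

def lag_weights(tweet_time, event_times, max_lag=120.0, tau=30.0):
    """Normalized weights over events that precede a tweet.

    Hypothetical scheme (not from the paper): an event that happened
    `lag` seconds before the tweet, with 0 <= lag <= max_lag, gets raw
    weight exp(-lag / tau); every other event gets 0.
    """
    raw = []
    for t in event_times:
        lag = tweet_time - t
        raw.append(math.exp(-lag / tau) if 0.0 <= lag <= max_lag else 0.0)
    total = sum(raw)
    # If no event falls inside the window, return the all-zero weights.
    return [w / total for w in raw] if total > 0 else raw

def lag_aware_context(tweet_time, event_times, event_features, **kw):
    """Lag-weighted average of per-event feature vectors for one tweet."""
    weights = lag_weights(tweet_time, event_times, **kw)
    dim = len(event_features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, event_features))
        for d in range(dim)
    ]

# Toy data: four events (seconds) with 2-dimensional features.
event_times = [0.0, 30.0, 60.0, 200.0]
event_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]

weights = lag_weights(90.0, event_times)
context = lag_aware_context(90.0, event_times, event_features)
```

In this toy setup, a tweet posted at 90 seconds gives the most weight to the event at 60 seconds, less to the earlier ones, and none to the event at 200 seconds, which has not yet happened. A learned model would presumably estimate such relationships from data rather than fix a decay by hand.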