Paper ID | MLR-APPL-IVASR-5.9 | ||
Paper Title | RETHINKING GENRE CLASSIFICATION WITH FINE GRAINED SEMANTIC CLUSTERING | ||
Authors | Edward Fish, Jon Weinbren, Andrew Gilbert, University of Surrey, United Kingdom | ||
Session | MLR-APPL-IVASR-5: Machine learning for image and video analysis, synthesis, and retrieval 5 | ||
Location | Area C | ||
Session Time | Tuesday, 21 September, 15:30 - 17:00 | ||
Presentation Time | Tuesday, 21 September, 15:30 - 17:00 | ||
Presentation | Poster | ||
Topic | Applications of Machine Learning: Machine learning for image & video analysis, synthesis, and retrieval | ||
Abstract | Movie genre classification is an active research area in machine learning; however, the content of movies can vary widely within a single genre label. We expand these 'coarse' genre labels by identifying 'fine-grained' contextual relationships within the multi-modal content of videos. By leveraging pre-trained 'expert' networks, we learn the influence of different combinations of modes for multi-label genre classification. We then continue to fine-tune this 'coarse' genre classification network in a self-supervised manner to sub-divide the genres based on the multi-modal content of the videos. We demonstrate our approach on MMX-Trailer-20, a new multi-modal dataset of 8,800 movie trailers (37,866,450 frames) that includes pre-computed audio, location, motion, and image embeddings. |
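
The idea in the abstract of learning how much each mode contributes to a multi-label genre prediction can be illustrated with a minimal sketch. This is not the authors' architecture: the softmax gate, the embedding size, the number of genres, and every name below (`EMBED_DIM`, `NUM_GENRES`, `init_params`, `forward`) are hypothetical, and random vectors stand in for the pre-computed audio, location, motion, and image embeddings. Only the forward pass is shown.

```python
import numpy as np

# Modalities named in the abstract; the sizes below are hypothetical.
MODALITIES = ("audio", "location", "motion", "image")
EMBED_DIM = 16   # assumed embedding size, not taken from the paper
NUM_GENRES = 6   # assumed number of genre labels, not taken from the paper


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(rng):
    # Randomly initialised weights; a real model would learn these.
    n_mod = len(MODALITIES)
    return {
        "gate_w": rng.normal(scale=0.1, size=(n_mod * EMBED_DIM, n_mod)),
        "cls_w": rng.normal(scale=0.1, size=(EMBED_DIM, NUM_GENRES)),
        "cls_b": np.zeros(NUM_GENRES),
    }


def forward(params, embeddings):
    """embeddings maps each modality name to an array of shape (batch, EMBED_DIM)."""
    stacked = np.stack([embeddings[m] for m in MODALITIES], axis=1)  # (batch, M, D)
    flat = stacked.reshape(stacked.shape[0], -1)                      # (batch, M*D)
    # One weight per modality per trailer, summing to 1 across modalities.
    weights = softmax(flat @ params["gate_w"], axis=-1)               # (batch, M)
    # Weighted sum of the modality embeddings.
    fused = (weights[:, :, None] * stacked).sum(axis=1)               # (batch, D)
    # Independent sigmoid per genre, so several genres can be active at once.
    probs = sigmoid(fused @ params["cls_w"] + params["cls_b"])        # (batch, G)
    return weights, probs


rng = np.random.default_rng(0)
params = init_params(rng)
# Random stand-ins for pre-computed expert embeddings of 3 trailers.
batch = {m: rng.normal(size=(3, EMBED_DIM)) for m in MODALITIES}
weights, probs = forward(params, batch)
```

Here `weights` holds one row per trailer with a contribution score for each mode, and `probs` holds an independent probability per genre, which is what makes the output multi-label rather than single-label.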